How to pass a column to arrange() and mutate()
I want a function that uses dplyr and looks like AddPercentColumns() below.
AddPercentColumns <- function(df, col) {
  # Sorts and adds "Percent" and "Cumulative Percent" columns to a data.frame.
  #
  # Args:
  #   df: data frame
  #   col: column symbol
  #
  # Returns:
  #   Data frame sorted by "col" with new "Percent" and "Cumulative Percent" columns.
  df %>%
    arrange(desc(col)) %>%
    mutate(Percent = col / sum(col) * 100) %>%
    mutate(Cumulative = cumsum(Percent))
}
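However, I can't figure out how to get around NSE (non-standard evaluation). I could pass in the column name as a string and use arrange_() and mutate_(), but I'm not sure how to handle desc(), sum(), and cumsum().
How should this function be written using dplyr?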
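To make the problem concrete, here is a minimal sketch of what goes wrong with the naive version. The data frame `orders`, its column `qty`, and the helper `NaiveArrange()` are hypothetical names invented for illustration. Inside the function, `col` is an ordinary argument, so R tries to evaluate `qty` as a variable in the calling environment rather than as a column of `df`, and the call fails.

```r
library(dplyr)

# Hypothetical example data; `qty` is an invented column name.
orders <- data.frame(item = c("a", "b", "c"), qty = c(1, 3, 4))

# Naive version: `col` is treated as a regular argument, not a column.
NaiveArrange <- function(df, col) {
  df %>% arrange(desc(col))
}

# Passing the bare column name raises an error, which is captured here
# instead of stopping the script.
naive_result <- tryCatch(NaiveArrange(orders, qty), error = function(e) e)
failed <- inherits(naive_result, "error")
```

This assumes no variable named `qty` exists in the calling environment; if one did, it would silently be used instead of the column, which is arguably worse.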
I find sprintf() easier to read than paste(). The function below looks like it would be "fun" to debug, but it gets the job done.
AddPercentColumn <- function(df, col) {
  # Sorts data.frame and adds "Percent" and "Cumulative Percent" columns.
  #
  # Args:
  #   df: data frame
  #   col: column name string
  #
  # Returns:
  #   Data frame sorted by "col" with new "Percent" and "Cumulative Percent" columns.
  df %>%
    arrange_(sprintf("desc(%s)", col)) %>%
    mutate_(Percent = sprintf("%s / sum(%s) * 100", col, col)) %>%
    mutate_(Cumulative = "cumsum(Percent)")
}
It's not very clean, though...
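As an alternative to building expressions with sprintf(), here is a minimal sketch that keeps the string argument but looks the column up through the `.data` pronoun. It assumes a dplyr version that exports `.data` (0.7 or later), which is newer than the arrange_()/mutate_() interface used above. The function name `AddPercentColumnStr()` and the sample data are hypothetical.

```r
library(dplyr)

AddPercentColumnStr <- function(df, col) {
  # col: column name as a string, e.g. "qty"
  df %>%
    arrange(desc(.data[[col]])) %>%
    mutate(Percent = .data[[col]] / sum(.data[[col]]) * 100,
           Cumulative = cumsum(Percent))
}

# Hypothetical example data.
orders <- data.frame(item = c("a", "b", "c"), qty = c(1, 3, 4))
result_str <- AddPercentColumnStr(orders, "qty")
```

Because no code is assembled from strings, a column name containing spaces or operators cannot change the expression that gets evaluated.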
Following Konrad's suggestion, I'm posting another solution. :)
AddPercentColumns <- function(df, col) {
  # Sorts data.frame and adds "Percent" and "Cumulative Percent" columns.
  #
  # Args:
  #   df: data frame
  #   col: unevaluated column symbol e.g. substitute(col)
  #
  # Returns:
  #   Data frame sorted by "col" with new "Percent" and "Cumulative Percent" columns.
  df %>%
    arrange_(bquote(desc(.(col)))) %>%
    mutate_(Percent = bquote(.(col) / sum(.(col)) * 100)) %>%
    mutate(Cumulative = cumsum(Percent))
}
Definitely cleaner, easier to debug, and more readable.
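For readers on a current dplyr, where arrange_() and mutate_() are deprecated, the same idea can be sketched with the embrace operator `{{ }}`. This assumes dplyr 1.0 or later (rlang 0.4 or later) and is not the approach used in the answers above; the name `AddPercentColumnsTidy()` and the sample data are hypothetical.

```r
library(dplyr)

AddPercentColumnsTidy <- function(df, col) {
  # col: bare (unquoted) column name, forwarded to dplyr with {{ }}
  df %>%
    arrange(desc({{ col }})) %>%
    mutate(Percent = {{ col }} / sum({{ col }}) * 100,
           Cumulative = cumsum(Percent))
}

# Hypothetical example data.
orders <- data.frame(item = c("a", "b", "c"), qty = c(1, 3, 4))
result_tidy <- AddPercentColumnsTidy(orders, qty)
```

The caller passes the column unquoted, which matches the signature originally asked for in the question.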