How to pass a column to arrange() and mutate()

Question

I would like a function that uses dplyr and looks like AddPercentColumns() below.

AddPercentColumns <- function(df, col) {
    # Sorts and adds "Percent" and "Cumulative Percent" columns to a data.frame.
    #
    # Args:
    #   df: data frame
    #   col: column symbol
    #
    # Returns:
    #   Data frame sorted by "col" with new "Percent" and "Cumulative Percent" columns.

    df %>%
        arrange(desc(col)) %>%
        mutate(Percent = col / sum(col) * 100) %>% 
        mutate(Cumulative = cumsum(Percent))
}

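However, I can't figure out how to get around NSE (non-standard evaluation). I could pass in a column name string and use arrange_() and mutate_(), but I'm not sure how to handle desc(), sum(), and cumsum().

How should this function be written using dplyr?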

Answer 1

I find sprintf() easier to read than paste(). The function below looks like it would be fun to debug, but it does the job.

AddPercentColumn <- function(df, col) {
    # Sorts data.frame and adds "Percent" and "Cumulative Percent" columns.
    #
    # Args:
    #   df: data frame
    #   col: column name string
    #
    # Returns:
    #   Data frame sorted by "col" with new "Percent" and "Cumulative Percent" columns.

    df %>%
        arrange_(sprintf("desc(%s)", col)) %>%
        mutate_(Percent = sprintf("%s / sum(%s) * 100", col, col)) %>% 
        mutate_(Cumulative = "cumsum(Percent)")
}

It's not very clean, though...
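To make it easier to see what arrange_() and mutate_() receive in the function above, here is a minimal base R sketch that builds the same strings and evaluates one of them by hand. The column name "mpg" and the built-in mtcars data set are assumptions chosen only for illustration; they are not part of the original answer.

```r
# Hypothetical column name, used only for illustration.
col <- "mpg"

# The same strings the function above hands to arrange_() and mutate_().
sort_string <- sprintf("desc(%s)", col)
pct_string  <- sprintf("%s / sum(%s) * 100", col, col)

print(sort_string)  # [1] "desc(mpg)"
print(pct_string)   # [1] "mpg / sum(mpg) * 100"

# Parsing the string yields an expression that can be evaluated with the
# data frame's columns in scope, which is roughly what happens downstream.
pct_expr <- parse(text = pct_string)[[1]]
percent  <- eval(pct_expr, mtcars)
```

Printing the strings like this is a cheap way to debug the function: a typo in the format string shows up here, before any data is touched.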

Answer 2

Following Konrad's suggestion, I'm posting another solution. :)

AddPercentColumns <- function(df, col) {
    # Sorts data.frame and adds "Percent" and "Cumulative Percent" columns.
    #
    # Args:
    #   df: data frame
    #   col: unevaluated column symbol e.g. substitute(col)
    #
    # Returns:
    #   Data frame sorted by "col" with new "Percent" and "Cumulative Percent" columns.

    df %>%
        arrange_(bquote(desc(.(col)))) %>%
        mutate_(Percent = bquote(.(col) / sum(.(col)) * 100)) %>% 
        mutate(Cumulative = cumsum(Percent))
}

Definitely cleaner, easier to debug, and more readable.
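For anyone wondering what bquote() actually builds here, the following base R sketch inspects the expressions on their own. The symbol mpg and the built-in mtcars data set are assumptions picked for illustration, not something from the answer itself.

```r
# Hypothetical unevaluated column symbol, as a caller might pass it in.
col <- quote(mpg)

# bquote() substitutes the value of `col` wherever .(col) appears,
# producing ordinary unevaluated calls rather than strings.
sort_expr <- bquote(desc(.(col)))
pct_expr  <- bquote(.(col) / sum(.(col)) * 100)

print(sort_expr)  # desc(mpg)

# Because these are real calls, they can be inspected or evaluated directly
# with the data frame's columns in scope.
percent <- eval(pct_expr, mtcars)
```

Since the pieces are language objects instead of pasted text, a malformed expression fails when it is built, not later when a string is parsed.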

r

dplyr