How to pass multiple group_by arguments and a dynamic variable argument to a dplyr function
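I am trying to pass multiple group_by arguments to a dplyr function, along with a named variable. As I understand it, I need to use quosures so that dplyr understands the variables I pass to it. The following code works fine: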
quantileMaker2 <- function(data, groupCol, calcCol) {
  groupCol <- enquo(groupCol)
  calcCol <- enquo(calcCol)
  data %>%
    group_by(!! groupCol) %>%
    summarise('25%' = currency(quantile(!! calcCol, probs = 0.25), digits = 2L),
              '50%' = currency(quantile(!! calcCol, probs = 0.50), digits = 2L),
              '75%' = currency(quantile(!! calcCol, probs = 0.75), digits = 2L),
              avg = currency(mean(!! calcCol), digits = 2L),
              nAgencies = n_distinct('POSIT ID'),
              nFTEs = sum(FTEs)
    )
}
quantileMaker2(df, employerClass, TCCperFTE)
However, when I run the following, I run into a problem:
quantileMaker3 <- function(data, ..., calcCol) {
  groupCol <- quos(...)
  calcCol <- quo(calcCol)
  data %>%
    group_by(!!! groupCol) %>%
    summarise('25%' = currency(quantile(!! calcCol, probs = 0.25), digits = 2L),
              '50%' = currency(quantile(!! calcCol, probs = 0.50), digits = 2L),
              '75%' = currency(quantile(!! calcCol, probs = 0.75), digits = 2L),
              avg = currency(mean(!! calcCol), digits = 2L),
              nAgencies = n_distinct('POSIT ID'),
              nFTEs = sum(FTEs)
    )
}
This returns the following error:
Error in summarise_impl(.data, dots) :
Evaluation error: anyNA() applied to non-(list or vector) of type 'symbol'.
Sample data:
Year  employerClass  TCCperFTE  FTEs  POSIT ID
2014  One                 5000    20         1
2014  Two                 1000    30         2
2015  One                15000    40         1
2015  Two                50000    50         2
2016  One               100000    60         1
2016  Two               500000    70         2
Any help you can provide would be greatly appreciated.
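You did not provide sample data, but your function works once it is modified to use the mtcars data frame. There are two changes to the function itself: calcCol is captured with enquo(calcCol) instead of quo(calcCol), and calcCol is moved ahead of ... in the argument list so it can be passed by position. The cyl and hp columns stand in for POSIT ID and FTEs.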
library(tidyverse)
library(formattable)
quantileMaker3 <- function(data, calcCol, ...) {
  groupCol <- quos(...)
  calcCol <- enquo(calcCol)
  data %>%
    group_by(!!!groupCol) %>%
    summarise('25%' = currency(quantile(!!calcCol, probs = 0.25), digits = 2L),
              '50%' = currency(quantile(!!calcCol, probs = 0.50), digits = 2L),
              '75%' = currency(quantile(!!calcCol, probs = 0.75), digits = 2L),
              avg = currency(mean(!!calcCol), digits = 2L),
              nAgencies = n_distinct(cyl),
              nFTEs = sum(hp)
    )
}
quantileMaker3(mtcars, mpg, cyl)
# A tibble: 3 x 7
    cyl `25%`             `50%`             `75%`             avg               nAgencies nFTEs
  <dbl> <S3: formattable> <S3: formattable> <S3: formattable> <S3: formattable>     <int> <dbl>
1    4. $22.80            $26.00            $30.40            $26.66                    1  909.
2    6. $18.65            $19.70            $21.00            $19.74                    1  856.
3    8. $14.40            $15.20            $16.25            $15.10                    1 2929.
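To see why swapping quo() for enquo() matters, here is a minimal sketch using two hypothetical helper functions (with_quo and with_enquo are illustrative names, not part of any package). quo() quotes the expression written inside the function body, while enquo() quotes the expression the caller supplied, which is most likely why the original version ended up handing a bare symbol to quantile().

```r
library(rlang)

# Hypothetical helpers, for illustration only.
with_quo   <- function(x) quo(x)    # quotes the literal argument name `x`
with_enquo <- function(x) enquo(x)  # quotes whatever the caller passed as `x`

# Inspect the expression each quosure captured.
expr_from_quo   <- quo_get_expr(with_quo(mpg))    # the symbol `x`
expr_from_enquo <- quo_get_expr(with_enquo(mpg))  # the symbol `mpg`

print(expr_from_quo)
print(expr_from_enquo)
```

Only the enquo() version carries the column name through to summarise().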
With multiple grouping arguments:
quantileMaker3(mtcars, mpg, cyl, vs)
# A tibble: 5 x 8
# Groups:   cyl [?]
    cyl    vs `25%`             `50%`             `75%`             avg               nAgencies nFTEs
  <dbl> <dbl> <S3: formattable> <S3: formattable> <S3: formattable> <S3: formattable>     <int> <dbl>
1    4.    0. $26.00            $26.00            $26.00            $26.00                    1   91.
2    4.    1. $22.80            $25.85            $30.40            $26.73                    1  818.
3    6.    0. $20.35            $21.00            $21.00            $20.57                    1  395.
4    6.    1. $18.03            $18.65            $19.75            $19.12                    1  461.
5    8.    0. $14.40            $15.20            $16.25            $15.10                    1 2929.
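If you would rather keep the grouping columns first, as in the original signature, a sketch along the following lines should also work, provided calcCol is passed by name (arguments that come after ... can only be matched by name). quantile_by is a hypothetical name, and the summary is trimmed to two columns for brevity.

```r
library(dplyr)

# Hypothetical variant keeping the dots-first signature from the question.
quantile_by <- function(data, ..., calcCol) {
  groupCol <- quos(...)       # capture all grouping columns
  calcCol <- enquo(calcCol)   # capture the column the caller named
  data %>%
    group_by(!!!groupCol) %>%
    summarise(median = unname(quantile(!!calcCol, probs = 0.5)),
              avg = mean(!!calcCol))
}

# calcCol must be named here; otherwise mpg would be absorbed into `...`.
res <- quantile_by(mtcars, cyl, vs, calcCol = mpg)
print(res)
```

Which argument order to use is a matter of taste: putting calcCol first allows positional calls, while dots-first reads more like group_by() itself.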
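By the way, you can avoid calling quantile multiple times by using nesting: compute all the quantiles in one call, store them in a list column, then unnest and spread them into separate columns. This will not work if any of the output columns is of class formattable (which is what the currency function returns), so I changed the function to create the currency-formatted columns as strings.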
quantileMaker3 <- function(data, calcCol, ..., quantiles=c(0.25,0.5,0.75)) {
  groupCol <- quos(...)
  calcCol <- enquo(calcCol)
  data %>%
    group_by(!!!groupCol) %>%
    summarise(values = list(paste0("$", sprintf("%1.2f", quantile(!!calcCol, probs=quantiles)))),
              qnames = list(sprintf("%1.0f%%", quantiles*100)),
              nAgencies = n_distinct(cyl),
              nFTEs = sum(hp),
              avg = paste0("$", sprintf("%1.2f", mean(!!calcCol)))
    ) %>%
    unnest %>%
    spread(qnames, values)
}
quantileMaker3(mtcars, mpg, cyl, vs)
# A tibble: 5 x 8
# Groups:   cyl [3]
    cyl    vs nAgencies nFTEs avg    `25%`  `50%`  `75%`
  <dbl> <dbl>     <int> <dbl> <chr>  <chr>  <chr>  <chr>
1    4.    0.         1   91. $26.00 $26.00 $26.00 $26.00
2    4.    1.         1  818. $26.73 $22.80 $25.85 $30.40
3    6.    0.         1  395. $20.57 $20.35 $21.00 $21.00
4    6.    1.         1  461. $19.12 $18.03 $18.65 $19.75
5    8.    0.         1 2929. $15.10 $14.40 $15.20 $16.25