Evaluating ... when other function arguments are NULL by default
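I want to provide a user-facing function that lets users pass arbitrary grouping variables to a summary function, with an optional extra filtering argument that is NULL by default (and therefore not evaluated).
I understand why the example below fails (it is ambiguous where homeworld belongs, and the other argument takes precedence), but I'm not sure what the best way is to pass the dots properly in this situation. Ideally, the second and third calls to fun below would return the same result.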
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
fun <- function(.df, .species = NULL, ...) {
  .group_vars <- rlang::ensyms(...)
  if (!is.null(.species)) {
    .df <- .df %>%
      dplyr::filter(.data[["species"]] %in% .species)
  }
  .df %>%
    dplyr::group_by(!!!.group_vars) %>%
    dplyr::summarize(
      ht = mean(.data[["height"]], na.rm = TRUE),
      .groups = "drop"
    )
}
fun(.df = starwars, .species = c("Human", "Droid"), species, homeworld)
#> # A tibble: 19 x 3
#> species homeworld ht
#> <chr> <chr> <dbl>
#> 1 Droid Naboo 96
#> 2 Droid Tatooine 132
#> 3 Droid <NA> 148
#> 4 Human Alderaan 176.
#> 5 Human Bespin 175
#> 6 Human Bestine IV 180
#> 7 Human Chandrila 150
#> 8 Human Concord Dawn 183
#> 9 Human Corellia 175
#> 10 Human Coruscant 168.
#> 11 Human Eriadu 180
#> 12 Human Haruun Kal 188
#> 13 Human Kamino 183
#> 14 Human Naboo 168.
#> 15 Human Serenno 193
#> 16 Human Socorro 177
#> 17 Human Stewjon 182
#> 18 Human Tatooine 179.
#> 19 Human <NA> 193
fun(.df = starwars, .species = NULL, homeworld)
#> # A tibble: 49 x 2
#> homeworld ht
#> <chr> <dbl>
#> 1 Alderaan 176.
#> 2 Aleen Minor 79
#> 3 Bespin 175
#> 4 Bestine IV 180
#> 5 Cato Neimoidia 191
#> 6 Cerea 198
#> 7 Champala 196
#> 8 Chandrila 150
#> 9 Concord Dawn 183
#> 10 Corellia 175
#> # … with 39 more rows
fun(.df = starwars, homeworld)
#> Error in fun(.df = starwars, homeworld): object 'homeworld' not found
<sup>Created on 2020-06-15 by the [reprex package](https://reprex.tidyverse.org) (v0.3.0)</sup>
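To see concretely where homeworld ends up in the failing call, match.call() can report how R matched the supplied arguments to the formals. This is only a minimal sketch: f is a hypothetical stand-in that shares the signature of fun but does nothing except return the matched call.

```r
# `f` is a hypothetical stand-in with the same signature as `fun`.
# It returns the call with each argument labelled by the formal it matched.
f <- function(.df, .species = NULL, ...) {
  match.call(expand.dots = FALSE)
}

matched <- f(.df = 1, homeworld)
print(matched)

# The unnamed `homeworld` fills the first free formal, `.species`,
# so it never reaches `...`
identical(matched$.species, quote(homeworld))
```

Because homeworld is bound to .species, the later is.null(.species) check forces it as an ordinary variable, which is where the "object 'homeworld' not found" error comes from.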
I know I can get the desired result with:
fun <- function(.df, .species = NULL, .groups = NULL) {
  .group_vars <- rlang::syms(purrr::map(.groups, rlang::as_string))
  ...
}
But I'm looking for a solution that uses ..., or one that lets the user pass either strings or symbols to .groups, e.g. .groups = c(species, homeworld) or .groups = c("species", "homeworld").
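You can move the arguments so that .species comes after the dots. Arguments that follow ... in a function signature can only be matched by their full name, so an unnamed argument such as homeworld is collected by ... instead of being bound to .species.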
fun <- function(.df, ..., .species = NULL) {
  .group_vars <- rlang::ensyms(...)
  if (!is.null(.species)) {
    .df <- .df %>%
      dplyr::filter(.data[["species"]] %in% .species)
  }
  .df %>%
    dplyr::group_by(!!!.group_vars) %>%
    dplyr::summarize(
      ht = mean(.data[["height"]], na.rm = TRUE),
      .groups = "drop"
    )
}
fun(.df = starwars, homeworld)
This gives
> fun(.df = starwars, homeworld)
# A tibble: 49 x 3
homeworld ht .groups
<chr> <dbl> <chr>
1 NA 139. drop
2 Alderaan 176. drop
3 Aleen Minor 79 drop
4 Bespin 175 drop
5 Bestine IV 180 drop
6 Cato Neimoidia 191 drop
7 Cerea 198 drop
8 Champala 196 drop
9 Chandrila 150 drop
10 Concord Dawn 183 drop
# ... with 39 more rows
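which is what you wanted to happen. The other examples still work, as long as .species is passed by name. (The extra .groups column in this output comes from running the code with a dplyr version older than 1.0.0, where summarize() has no .groups argument and treats it as a new column; with dplyr 1.0.0 or later that column does not appear.)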
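For the other option raised in the question, a .groups argument that accepts either bare names or strings, one possible sketch is to capture the argument with rlang::enquo() and resolve it with tidyselect::eval_select(). The name fun2 is hypothetical, and this assumes dplyr 1.0.0 or later (for the .groups argument of summarize()) with tidyselect installed; it is an illustration, not the only way to do it.

```r
library(dplyr)

# Hypothetical variant of `fun`: `.groups` takes bare names or strings
fun2 <- function(.df, .species = NULL, .groups = NULL) {
  # Capture the expression without evaluating it
  .groups <- rlang::enquo(.groups)

  # Resolve the selection to column names; the NULL default means no grouping
  .group_names <- if (rlang::quo_is_null(.groups)) {
    character()
  } else {
    names(tidyselect::eval_select(.groups, .df))
  }
  .group_vars <- rlang::syms(.group_names)

  if (!is.null(.species)) {
    .df <- .df %>%
      dplyr::filter(.data[["species"]] %in% .species)
  }

  .df %>%
    dplyr::group_by(!!!.group_vars) %>%
    dplyr::summarize(
      ht = mean(.data[["height"]], na.rm = TRUE),
      .groups = "drop"
    )
}

by_symbol <- fun2(starwars, .groups = c(species, homeworld))
by_string <- fun2(starwars, .groups = c("species", "homeworld"))
no_groups <- fun2(starwars)
```

With this design both spellings select the same columns, and leaving .groups out summarizes the whole data frame into a single row.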