Evaluating ... when other function arguments are NULL by default

Question

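I want to provide a user-facing function that lets the caller pass arbitrary grouping variables to a summary function, with the option of specifying an additional filtering argument that is NULL by default (so the filter is skipped).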

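I understand why the example below fails (it is ambiguous where homeworld belongs, and the other argument, .species, takes precedence), but I'm not sure what the best way is to pass the dots through in this situation. Ideally, the second and third calls to fun below would return the same result.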

```r
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
fun <- function(.df, .species = NULL, ...) {

  .group_vars <- rlang::ensyms(...)

  if (!is.null(.species)) {
    .df <- .df %>%
      dplyr::filter(.data[["species"]] %in% .species)
  }

  .df %>%
    dplyr::group_by(!!!.group_vars) %>%
    dplyr::summarize(
      ht = mean(.data[["height"]], na.rm = TRUE),
      .groups = "drop"
    )

}

fun(.df = starwars, .species = c("Human", "Droid"), species, homeworld)
#> # A tibble: 19 x 3
#>    species homeworld       ht
#>    <chr>   <chr>        <dbl>
#>  1 Droid   Naboo          96 
#>  2 Droid   Tatooine      132 
#>  3 Droid   <NA>          148 
#>  4 Human   Alderaan      176.
#>  5 Human   Bespin        175 
#>  6 Human   Bestine IV    180 
#>  7 Human   Chandrila     150 
#>  8 Human   Concord Dawn  183 
#>  9 Human   Corellia      175 
#> 10 Human   Coruscant     168.
#> 11 Human   Eriadu        180 
#> 12 Human   Haruun Kal    188 
#> 13 Human   Kamino        183 
#> 14 Human   Naboo         168.
#> 15 Human   Serenno       193 
#> 16 Human   Socorro       177 
#> 17 Human   Stewjon       182 
#> 18 Human   Tatooine      179.
#> 19 Human   <NA>          193
fun(.df = starwars, .species = NULL, homeworld)
#> # A tibble: 49 x 2
#>    homeworld         ht
#>    <chr>          <dbl>
#>  1 Alderaan        176.
#>  2 Aleen Minor      79 
#>  3 Bespin          175 
#>  4 Bestine IV      180 
#>  5 Cato Neimoidia  191 
#>  6 Cerea           198 
#>  7 Champala        196 
#>  8 Chandrila       150 
#>  9 Concord Dawn    183 
#> 10 Corellia        175 
#> # … with 39 more rows
fun(.df = starwars, homeworld)
#> Error in fun(.df = starwars, homeworld): object 'homeworld' not found
```


<sup>Created on 2020-06-15 by the [reprex package](https://reprex.tidyverse.org) (v0.3.0)</sup>
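For reference, the failure in the third call comes from R's argument matching: an unnamed argument fills the first unmatched formal that precedes `...`, so `homeworld` is bound to `.species` and is only evaluated (and not found) inside `is.null()`. The sketch below shows this with base R alone; `show_matching` is a hypothetical helper that only mirrors the signature of `fun`, it is not part of dplyr or rlang.

```r
# Hypothetical helper with the same argument order as fun():
# .species comes BEFORE the dots.
show_matching <- function(.df, .species = NULL, ...) {
  list(
    # unevaluated expression that was bound to .species
    species_expr = substitute(.species),
    # names of whatever ended up in the dots
    dots = as.character(substitute(list(...)))[-1]
  )
}

res <- show_matching(.df = NULL, homeworld)
res$species_expr  # the symbol `homeworld`, i.e. it was matched to .species
res$dots          # character(0): nothing reached the dots
```

Passing `.species = NULL` explicitly, as in the second call, occupies that formal and lets `homeworld` fall through to the dots.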

I know I can get the desired behavior with:

```r
fun <- function(.df, .species = NULL, .groups = NULL) {

  .group_vars <- rlang::syms(purrr::map(.groups, rlang::as_string))

  ...

}
```

But I'm looking for a solution that uses ..., or one that lets the user pass either strings or symbols to .groups, e.g. .groups = c(species, homeworld) or .groups = c("species", "homeworld").
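As a rough illustration of the second option, one possible way to accept both forms is to capture the `.groups` expression unevaluated and convert each element to a symbol. This is only a sketch in base R under stated assumptions: `groups_to_syms` and `capture_groups` are hypothetical names, and the approach inspects the literal expression, so it would not work for a variable that holds a character vector.

```r
# Hypothetical helper: turn an unevaluated expression such as
# c(species, homeworld), c("species", "homeworld"), species or "species"
# into a list of symbols.
groups_to_syms <- function(expr) {
  if (is.null(expr)) {
    return(list())                      # default: no grouping variables
  }
  if (is.call(expr) && identical(expr[[1]], quote(c))) {
    parts <- as.list(expr)[-1]          # drop the `c` itself
  } else {
    parts <- list(expr)                 # a single symbol or string
  }
  lapply(parts, function(p) as.symbol(as.character(p)))
}

# Hypothetical wrapper showing how a function would capture its argument.
capture_groups <- function(.groups = NULL) {
  groups_to_syms(substitute(.groups))
}

capture_groups(c(species, homeworld))
capture_groups(c("species", "homeworld"))
capture_groups()
```

The resulting list could then be spliced into `dplyr::group_by()` with `!!!`, in the same way `.group_vars` is used in the function above.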

Answer 1

You can reorder the arguments so that .species comes after the dots.

```r
fun <- function(.df, ..., .species = NULL) {

  .group_vars <- rlang::ensyms(...)

  if (!is.null(.species)) {
    .df <- .df %>%
      dplyr::filter(.data[["species"]] %in% .species)
  }

  .df %>%
    dplyr::group_by(!!!.group_vars) %>%
    dplyr::summarize(
      ht = mean(.data[["height"]], na.rm = TRUE),
      .groups = "drop"
    )

}

fun(.df = starwars, homeworld)
```

This gives

```
> fun(.df = starwars, homeworld)
# A tibble: 49 x 3
   homeworld         ht .groups
   <chr>          <dbl> <chr>  
 1 NA              139. drop   
 2 Alderaan        176. drop   
 3 Aleen Minor      79  drop   
 4 Bespin          175  drop   
 5 Bestine IV      180  drop   
 6 Cato Neimoidia  191  drop   
 7 Cerea           198  drop   
 8 Champala        196  drop   
 9 Chandrila       150  drop   
10 Concord Dawn    183  drop   
# ... with 39 more rows
```

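This is what you wanted to happen. The other examples still work, because they already pass .species by name. (The extra .groups column in the output above most likely comes from a dplyr version older than 1.0.0, where summarize() has no .groups argument and treats it as a new summary column; with dplyr 1.0.0 or later that column is not created.)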
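One consequence of this ordering is worth keeping in mind: arguments that come after `...` can only be matched by their full name, so anything unnamed, or named with an abbreviation, ends up in the dots. A minimal base R sketch of that behavior follows; `show_matching2` is a hypothetical helper that only mirrors the reordered signature.

```r
# Hypothetical helper with the reordered signature: .species comes AFTER the dots.
show_matching2 <- function(.df, ..., .species = NULL) {
  list(
    species = .species,
    # names (or values) of whatever ended up in the dots
    dots = as.character(substitute(list(...)))[-1]
  )
}

# Unnamed arguments now go to the dots; .species keeps its NULL default.
r1 <- show_matching2(.df = NULL, homeworld)
r1$dots             # "homeworld"
is.null(r1$species) # TRUE

# .species is picked up when spelled out in full...
r2 <- show_matching2(.df = NULL, species, homeworld, .species = c("Human", "Droid"))
r2$species          # "Human" "Droid"

# ...but an abbreviated name is NOT partially matched; it lands in the dots.
r3 <- show_matching2(.df = NULL, homeworld, .sp = "Human")
is.null(r3$species) # TRUE
```

In the real function this means a call such as `fun(starwars, homeworld, .sp = "Human")` would not filter at all and would instead pass the extra argument on to `rlang::ensyms()`.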

Tags: r, dplyr, rlang