Non-standard evaluation in dplyr when using dots for variable number of arguments

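I'm trying to write a function that can be used in a dplyr pipeline. It should take an arbitrary number of columns as arguments and replace certain substrings in only those columns. Below is a simple example of what I have so far.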

library(tidyverse)

tib <- tibble(
  x = c("cats and dogs", "foxes and hounds"),
  y = c("whales and dolphins", "cats and foxes"),
  z = c("dogs and geese", "cats and mice")
)

filter_words <- function(.data, ...) {
  words_to_filter <- c("cat", "dog")

  .data %>% mutate(
    across(..., ~ gsub(
        paste0(words_to_filter, collapse = "|"),
        "#@!*", ., perl = TRUE
      )
    )
  )
}

filtered_tib <- tib %>%
  filter_words(x, y)

If this worked, I would expect:

x                 y                    z
#@!*s and #@!*s   whales and dolphins  dogs and geese
foxes and hounds  #@!*s and foxes      cats and mice

But I get an error:

Error: Can't splice an object of type `closure` because it is not a vector
Run `rlang::last_error()` to see where the error occurred.
Called from: signal_abort(cnd)

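I've tried various combinations of non-standard evaluation, gleaned from the tidyverse docs and many questions on SO, and have seen almost as many different errors! Can anyone help get this working? If I replace the dots with everything(), it does work, but that doesn't suit my use case of filtering only certain columns.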

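If you are using a recent version of the tidyverse, the recommended approach is to use the {{ }} operator to defuse the argument and pass it straight to the .cols argument of across(). Like this: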

filter_words <- function(.data, .mycols) {
  words_to_filter <- c("cat", "dog")

  .data %>% mutate(
    across({{ .mycols }}, ~ gsub(
        paste0(words_to_filter, collapse = "|"),
        "#@!*", ., perl = TRUE
      )
    )
  )
}

tib %>% filter_words(c(x, y))

You can then treat .mycols as the first argument of across() and use any tidy-select expression you like. The output is

# A tibble: 2 x 3
  x                y                   z             
  <chr>            <chr>               <chr>         
1 #@!*s and #@!*s  whales and dolphins dogs and geese
2 foxes and hounds #@!*s and foxes     cats and mice 
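Because .mycols is handed to across() unchanged, other tidy-select expressions should work as well as c(x, y). The sketch below restates the function above (using a local `pattern` variable and `.x`, which are my own small changes) and compares a selection by name with a selection by negation. The names `res_names` and `res_helper` are illustrative only.

```r
library(dplyr)

tib <- tibble(
  x = c("cats and dogs", "foxes and hounds"),
  y = c("whales and dolphins", "cats and foxes"),
  z = c("dogs and geese", "cats and mice")
)

filter_words <- function(.data, .mycols) {
  words_to_filter <- c("cat", "dog")
  # Build the regular expression once, outside the lambda
  pattern <- paste0(words_to_filter, collapse = "|")

  .data %>% mutate(
    across({{ .mycols }}, ~ gsub(pattern, "#@!*", .x, perl = TRUE))
  )
}

# Bare column names combined with c()
res_names <- tib %>% filter_words(c(x, y))

# A tidy-select expression: every column except z
res_helper <- tib %>% filter_words(!z)

# Both calls select x and y, so the two results can be compared directly
identical(res_names, res_helper)
```

The trade-off of this design is that callers must wrap several columns in c(), since the function takes one selection argument rather than dots.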

You can use match.call to capture the dots (...).

library(dplyr)

filter_words <- function(.data, ...) {
  words_to_filter <- c("cat", "dog")
  args <- as.character(match.call(expand.dots = FALSE)$`...`)
  .data %>% mutate(
    across(all_of(args), ~ gsub(
      paste0(words_to_filter, collapse = "|"),
      "#@!*", ., perl = TRUE
    )
    )
  )
}

tib %>% filter_words(x, y)

#   x                y                   z             
#  <chr>            <chr>               <chr>         
#1 #@!*s and #@!*s  whales and dolphins dogs and geese
#2 foxes and hounds #@!*s and foxes     cats and mice 

tib %>% filter_words(x)

# A tibble: 2 x 3
#   x                y                   z             
#  <chr>            <chr>               <chr>         
#1 #@!*s and #@!*s  whales and dolphins dogs and geese
#2 foxes and hounds cats and foxes      cats and mice 
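To make the match.call() step above easier to inspect, here is a small sketch that isolates it. `dots_as_names` is a hypothetical helper introduced only for illustration; it is not part of dplyr or of the answer's function.

```r
# Hypothetical helper: returns whatever was passed in ... as character strings
dots_as_names <- function(...) {
  as.character(match.call(expand.dots = FALSE)$`...`)
}

# Bare column names become a character vector that all_of() can use
dots_as_names(x, y)

# A helper call is turned into text rather than evaluated,
# so all_of() would look for a column with that literal name
dots_as_names(starts_with("x"))
```

Since the arguments are converted to strings before selection, this approach appears best suited to bare column names; tidy-select helpers such as starts_with() would not be evaluated.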

In your function, across(..., should be changed to across(c(...),

library(dplyr, warn.conflicts = FALSE)
sessionInfo()$otherPkgs$dplyr$Version
#> [1] "1.0.7"

tib <- tibble(
  x = c("cats and dogs", "foxes and hounds"),
  y = c("whales and dolphins", "cats and foxes"),
  z = c("dogs and geese", "cats and mice")
)

filter_words <- function(.data, ...) {
  words_to_filter <- c("cat", "dog")

  .data %>% mutate(
    across(c(...), ~ gsub(
        paste0(words_to_filter, collapse = "|"),
        "#@!*", ., perl = TRUE
      )
    )
  )
}

tib %>%
  filter_words(x, y)
#> # A tibble: 2 × 3
#>   x                y                   z             
#>   <chr>            <chr>               <chr>         
#> 1 #@!*s and #@!*s  whales and dolphins dogs and geese
#> 2 foxes and hounds #@!*s and foxes     cats and mice

Created on 2022-01-17 by the reprex package (v2.0.1)
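A likely reason the original across(..., fails: the dots are forwarded positionally, so with filter_words(x, y) the first column lands in .cols and the second lands in .fns, where a function is expected. Wrapping the dots in c() collapses them into a single selection. As a sketch under that assumption, the c(...) version should also accept a mix of bare names and tidy-select helpers; `res` is an illustrative name.

```r
library(dplyr)

tib <- tibble(
  x = c("cats and dogs", "foxes and hounds"),
  y = c("whales and dolphins", "cats and foxes"),
  z = c("dogs and geese", "cats and mice")
)

filter_words <- function(.data, ...) {
  words_to_filter <- c("cat", "dog")
  pattern <- paste0(words_to_filter, collapse = "|")

  .data %>% mutate(
    # c(...) turns all dot arguments into one tidy-select expression
    across(c(...), ~ gsub(pattern, "#@!*", .x, perl = TRUE))
  )
}

# Mixing a bare name with a helper; z is not selected and stays as is
res <- tib %>% filter_words(x, starts_with("y"))
res
```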