Curly-curly tidy evaluation: programming with multiple inputs and a custom function across columns
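My question is similar to an earlier question, but I need to apply a more complex function across columns, and I can't figure out how to apply the solution Lionel suggested to a custom function that uses a scoped verb such as filter_at() or its filter() + across() equivalent. It doesn't look like a "superstache" / {{{}}} operator was ever introduced.
Here is a non-programmatic example of what I want to do (no NSE involved):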
library(dplyr)
library(magrittr)
foo <- tibble(group = c(1,1,2,2,3,3),
a = c(1,1,0,1,2,2),
b = c(1,1,2,2,0,1))
foo %>%
group_by(group) %>%
filter_at(vars(a,b), any_vars(n_distinct(.) != 1)) %>%
ungroup
#> # A tibble: 4 x 3
#> group a b
#> <dbl> <dbl> <dbl>
#> 1 2 0 2
#> 2 2 1 2
#> 3 3 2 0
#> 4 3 2 1
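I haven't found the filter() + across() equivalent of this filter_at() line, but since the new(ish) tidyeval functions predate dplyr 1.0, I think that issue can be set aside. Here is my attempt at a programmatic version, where the user supplies the filtering variables through the dots: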
my_function <- function(data, ..., by) {
dots <- enquos(..., .named = TRUE)
helperfunc <- function(arg) {
return(any_vars(n_distinct(arg) != length(arg)))
}
dots <- lapply(dots, function(dot) call("helperfunc", dot))
data %>%
group_by({{ by }}) %>%
filter(!!!dots) %>%
ungroup
}
foo %>%
my_function(a, b, group)
#> Error: Problem with `filter()` input `..1`.
#> x Input `..1` is named.
#> i This usually means that you've used `=` instead of `==`.
#> i Did you mean `a == helperfunc(a)`?
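I would be happy if there were a way to simply insert an NSE operator inside the vars() argument of filter_at() without making all these extra calls (I assume that is what a {{{}}} operator would do?).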
Maybe I'm misunderstanding the problem, but the standard pattern of forwarding the dots seems to work fine here:
my_function <- function(data, ..., by) {
data %>%
group_by({{ by }}) %>%
filter_at(vars(...), any_vars(n_distinct(.) != 1)) %>%
ungroup
}
foo %>%
my_function( a, b, by=group ) # works
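As for why the attempt in the question fails: enquos(..., .named = TRUE) gives every captured expression a name, so filter(!!!dots) receives named arguments, which is what the error message complains about. any_vars() is also a helper for the scoped verbs rather than for plain filter(). If you do want to build the condition by hand, here is a minimal sketch, assuming rlang is installed. The name my_function_dots is only illustrative. It creates one predicate per captured column and joins them with |:

```r
library(dplyr)

foo <- tibble(group = c(1, 1, 2, 2, 3, 3),
              a = c(1, 1, 0, 1, 2, 2),
              b = c(1, 1, 2, 2, 0, 1))

# Hypothetical helper for illustration; not part of dplyr or rlang.
my_function_dots <- function(data, ..., by) {
  # Capture the columns passed through the dots, without naming them
  dots <- enquos(...)
  # Build one predicate per column: n_distinct(<col>) != 1
  preds <- lapply(dots, function(dot) expr(n_distinct(!!dot) != 1))
  # Join the predicates with `|`, mimicking any_vars()
  combined <- Reduce(function(x, y) rlang::call2("|", x, y), preds)
  data %>%
    group_by({{ by }}) %>%
    filter(!!combined) %>%
    ungroup()
}

res_dots <- my_function_dots(foo, a, b, by = group)
```

This sketch assumes at least one column is passed through the dots. Forwarding the dots to vars() as shown above is shorter, so this is mainly useful if you need a different predicate per column.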
An option with across():
my_function <- function(data, by, ...) {
dots <- enquos(..., .named = TRUE)
nm1 <- purrr::map_chr(dots, rlang::as_label)
data %>%
dplyr::group_by({{ by }}) %>%
dplyr::mutate(across(nm1, ~ n_distinct(.) !=1, .names = "{col}_ind")) %>%
dplyr::ungroup() %>%
dplyr::filter(dplyr::select(., ends_with('ind')) %>% purrr::reduce(`|`)) %>%
dplyr::select(-ends_with('ind'))
}
my_function(foo, group, a, b)
# A tibble: 4 x 3
# group a b
# <dbl> <dbl> <dbl>
#1 2 0 2
#2 2 1 2
#3 3 2 0
#4 3 2 1
Or with filter()/across():
foo %>%
group_by(group) %>%
filter(any(!across(c(a,b), ~ n_distinct(.) == 1)))
# A tibble: 4 x 3
# Groups: group [2]
# group a b
# <dbl> <dbl> <dbl>
#1 2 0 2
#2 2 1 2
#3 3 2 0
#4 3 2 1
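If dplyr 1.0.4 or later is available, if_any() is probably the closest replacement for any_vars(), and it accepts {{ }} for the column selection. A minimal sketch under that version assumption follows. The name keep_varying_groups is made up for illustration:

```r
library(dplyr)  # assumes dplyr >= 1.0.4, where if_any() was added

foo <- tibble(group = c(1, 1, 2, 2, 3, 3),
              a = c(1, 1, 0, 1, 2, 2),
              b = c(1, 1, 2, 2, 0, 1))

# Hypothetical wrapper for illustration; not part of dplyr.
keep_varying_groups <- function(data, vars, by) {
  data %>%
    group_by({{ by }}) %>%
    # Keep a group if any selected column has more than one distinct value
    filter(if_any({{ vars }}, ~ n_distinct(.x) != 1)) %>%
    ungroup()
}

res_if_any <- keep_varying_groups(foo, c(a, b), by = group)
```

The columns are passed as a single tidyselect expression such as c(a, b), so helpers like starts_with() should work as well.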
vignette("colwise") describes a way to do this with across():
my_function <- function(data, vars, by) {
data %>%
group_by({{ by }}) %>%
filter(n_distinct(across({{ vars }}, ~ .x)) != 1) %>%
ungroup()
}
foo %>%
my_function(c(a, b), by = group)
# A tibble: 4 x 3
#   group     a     b
#   <dbl> <dbl> <dbl>
# 1     2     0     2
# 2     2     1     2
# 3     3     2     0
# 4     3     2     1
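The reason this is equivalent to the any_vars() version: n_distinct() applied to the data frame returned by across() counts distinct rows, that is, distinct combinations of the selected columns within the group. That count is 1 only when every selected column is constant, so != 1 holds exactly when at least one column varies. A small check of the per-group counts (the column name n_rows is arbitrary):

```r
library(dplyr)

foo <- tibble(group = c(1, 1, 2, 2, 3, 3),
              a = c(1, 1, 0, 1, 2, 2),
              b = c(1, 1, 2, 2, 0, 1))

# Count the distinct (a, b) combinations within each group
counts <- foo %>%
  group_by(group) %>%
  summarise(n_rows = n_distinct(across(c(a, b), ~ .x)))
```

Group 1 has a single distinct (a, b) row, while groups 2 and 3 have two each, which is why only those two groups survive the filter.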