Combining across and filter in groups

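Within each group (id), I want to filter the rows so that only x1, x2 and x3 values lying between the 5th and 95th percentiles are kept. However, I have not managed to combine across with my variables (x1, x2 and x3). In my example: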

library(dplyr)

data <- tibble::tibble(
  id = paste0("sample_", rep(1:10, 10)),
  x1 = rnorm(100),
  x2 = rnorm(100),
  x3 = rnorm(100)
)

data %>%
  group_by(id) %>%
  dplyr::filter(across(x1:x3, function(x) x > quantile(x, 0.05) 
                x < quantile(x, 0.95)))
#Error: Problem with `filter()` input `..1`.
#i Input `..1` is `across(...)`.
#i The error occurred in group 1: id = "sample_1".

You forgot the & between the two conditions:

library(dplyr)

data <- tibble::tibble(
  id = paste0("sample_", rep(1:10, 10)),
  x1 = rnorm(100),
  x2 = rnorm(100),
  x3 = rnorm(100)
)

data %>%
  group_by(id) %>%
  dplyr::filter(across(.cols = x1:x3, function(x) x > quantile(x, 0.05) & 
                       x < quantile(x, 0.95)))

   id            x1      x2      x3
   <chr>      <dbl>   <dbl>   <dbl>
 1 sample_2 -0.0222 -1.17   -0.634 
 2 sample_4 -0.584   0.400  -1.01  
 3 sample_8 -0.462  -0.890   0.851 
 4 sample_1  1.39   -0.0418 -1.31  
 5 sample_2 -0.446   1.61   -0.0368
 6 sample_3  0.617  -0.148  -0.358 
 7 sample_4 -1.20    0.340   0.0903
 8 sample_6 -0.538  -1.10   -0.387 
 9 sample_9 -0.680   0.195  -1.51  
10 sample_5 -0.779   0.419   0.720 

If you change the code to use & ("AND") between the two conditions, your function will run.

data %>%
  group_by(id) %>%
  dplyr::filter(across(x1:x3, function(x) x > quantile(x, 0.05) & x < quantile(x, 0.95)))
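
To see what the condition evaluates to inside a single group, here is a minimal sketch on a plain vector. The vector x and the names lower, upper and keep are made up for illustration and are not part of the question's data.

```r
# Hypothetical single-group vector, used only for illustration.
x <- 1:20

# Group-level cut-offs (R's default quantile type).
lower <- quantile(x, 0.05)
upper <- quantile(x, 0.95)

# `&` is the elementwise AND: it returns one TRUE/FALSE per element of x,
# which is the shape filter() expects for each column.
keep <- x > lower & x < upper

# Only the extremes (1 and 20) fall outside the band.
x[keep]
```

Without the &, the two comparisons are just two expressions written next to each other, which R cannot parse as a single condition.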

You can also shorten the code:

data %>%
  group_by(id) %>%
  filter(across(x1:x3, ~ .x > quantile(.x, 0.05) & .x < quantile(.x, 0.95)))

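However, I think filter is intended to be used with if_all or if_any (introduced in dplyr 1.0.4), depending on whether you want all of the selected columns or any of the selected columns to satisfy the condition.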

For example:

data %>%
  group_by(id) %>%
  filter(if_all(x1:x3, ~ .x > quantile(.x, 0.05) & .x < quantile(.x, 0.95)))

data %>%
  group_by(id) %>%
  filter(if_any(x1:x3, ~ .x > quantile(.x, 0.05) & .x < quantile(.x, 0.95)))
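
The difference between the two is easier to see on a small deterministic table than on random data. The sketch below uses a made-up tibble (toy, with columns a and b) and fixed thresholds instead of quantiles; these names and values are assumptions for illustration only.

```r
library(dplyr)

# Hypothetical toy data: a single group with hand-picked values.
toy <- tibble::tibble(
  id = rep("g1", 5),
  a  = c(1, 2, 3, 4, 5),
  b  = c(3, 3, 3, 9, 1)
)

# if_all: keep a row only if BOTH a and b are strictly between 1 and 5.
all_rows <- toy %>%
  group_by(id) %>%
  filter(if_all(a:b, ~ .x > 1 & .x < 5)) %>%
  ungroup()

# if_any: keep a row if AT LEAST ONE of a, b is strictly between 1 and 5.
any_rows <- toy %>%
  group_by(id) %>%
  filter(if_any(a:b, ~ .x > 1 & .x < 5)) %>%
  ungroup()

all_rows$a  # rows with a = 2, 3
any_rows$a  # rows with a = 1, 2, 3, 4
```

The row with a = 5, b = 1 fails in both columns, so neither variant keeps it. The rows with a = 1 and a = 4 pass in only one column, so only if_any keeps them.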

In your case, if_all and across give the same result, but I am not sure whether across is guaranteed to always behave the same as if_all.
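
One way to avoid relying on how across behaves inside filter is to compare if_all with the condition written out column by column. This is a rough sketch, not a statement about dplyr internals: in_band is a hypothetical helper (not a dplyr function), and the seed is only there to make the random data reproducible. As far as I know, later dplyr releases (1.0.8 onwards) deprecate across() inside filter() in favour of if_any() and if_all(), which is another reason to prefer them.

```r
library(dplyr)

set.seed(42)  # arbitrary seed, only for reproducibility
data <- tibble::tibble(
  id = paste0("sample_", rep(1:10, 10)),
  x1 = rnorm(100),
  x2 = rnorm(100),
  x3 = rnorm(100)
)

# Hypothetical helper: TRUE where x lies strictly inside its own
# 5th-95th percentile band (evaluated per group under group_by).
in_band <- function(x) x > quantile(x, 0.05) & x < quantile(x, 0.95)

# Version using if_all over the selected columns.
with_if_all <- data %>%
  group_by(id) %>%
  filter(if_all(x1:x3, in_band)) %>%
  ungroup()

# Same condition spelled out explicitly, one column at a time.
explicit <- data %>%
  group_by(id) %>%
  filter(in_band(x1) & in_band(x2) & in_band(x3)) %>%
  ungroup()

# Compare the two results.
all.equal(with_if_all, explicit)
```

The explicit version is longer, but it states the intended logic with no helper semantics to second-guess, so it works as a reference when checking a shorter form.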