How to combine filter(across(starts_with("foo"), ~ . logical-condition)) with mutate(bar = map2(...))?
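I would like to combine dplyr's filter() with selection helpers such as starts_with().
This post is a follow-up to an earlier question, but it additionally involves list-columns and map2() from the {purrr} package.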
Consider the following my_mtcars data frame:
library(tibble)
my_mtcars <-
mtcars %>%
rownames_to_column("cars")
I want to filter on any column whose name starts with/contains the string "cars", keeping only the following cars:
cars_to_keep <- c("Merc 240D", "Fiat X1-9", "Ferrari Dino")
From the earlier post, we learned how to use selection helpers together with filter(), like so:
library(dplyr)
filter(my_mtcars, across(contains("cars"), ~ . %in% cars_to_keep))
## cars mpg cyl disp hp drat wt qsec vs am gear carb
## 1 Merc 240D 24.4 4 146.7 62 3.69 3.190 20.0 1 0 4 2
## 2 Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.9 1 1 4 1
## 3 Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.5 0 1 5 6
So far so good.
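As a side note, newer dplyr releases (1.0.4 and later) provide if_any() and if_all() for this kind of predicate-based row filtering, and using across() inside filter() has since been deprecated in their favour. A minimal sketch of the same filter written with if_any(), assuming a sufficiently recent dplyr; the name kept is only for illustration:

```r
library(dplyr)
library(tibble)

my_mtcars <- rownames_to_column(mtcars, "cars")
cars_to_keep <- c("Merc 240D", "Fiat X1-9", "Ferrari Dino")

# Keep rows where any column whose name contains "cars" matches cars_to_keep.
# With a single matching column, if_any() and if_all() give the same result.
kept <- filter(my_mtcars, if_any(contains("cars"), ~ .x %in% cars_to_keep))
kept$cars
```

If several columns matched the selection helper, if_any() would keep a row when at least one of them passes the test, while if_all() would require all of them to pass.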
The problem arises with the following data structure:
higher_level_tibble <-
tibble(my_data = list(my_mtcars),
the_cars_i_want = list(cars_to_keep))
## # A tibble: 1 x 2
## my_data the_cars_i_want
## <list> <list>
## 1 <df [32 x 12]> <chr [3]>
While the following works:
library(purrr)
higher_level_tibble %>%
mutate(my_filtered_data = map2(.x = my_data, .y = the_cars_i_want, .f = ~filter(.x, cars %in% .y)))
## # A tibble: 1 x 3
## my_data the_cars_i_want my_filtered_data
## <list> <list> <list>
## 1 <df [32 x 12]> <chr [3]> <df [3 x 12]>
this does not:
higher_level_tibble %>%
mutate(my_filtered_data = map2(.x = my_data, .y = the_cars_i_want, .f = ~ filter(.x, across(starts_with("cars"), ~ . %in% .y))))
Error: Problem with mutate() column my_filtered_data.
i my_filtered_data = map2(...).
x Problem with filter() input ..1.
i Input ..1 is across(starts_with("cars"), ~. %in% .y).
x the ... list contains fewer than 2 elements
How can I use tidyselect helpers inside filter(), all within purrr::map2()?
EDIT
Desired output
higher_level_tibble %>%
mutate(my_filtered_data = map2(.x = my_data,
.y = the_cars_i_want,
.f = ~ .x %>% filter( from the col in .x whose header starts with "cars", return only values that appear in .y )))
## # A tibble: 1 x 3
## my_data the_cars_i_want my_filtered_data
## <list> <list> <list>
## 1 <df [32 x 12]> <chr [3]> <df [3 x 12]>
One possible solution, using purrr::pmap_dfr:
library(tidyverse)
my_mtcars <-
mtcars %>%
rownames_to_column("cars")
cars_to_keep <- c("Merc 240D", "Fiat X1-9", "Ferrari Dino")
higher_level_tibble <-
tibble(my_data = list(my_mtcars),
the_cars_i_want = list(cars_to_keep))
higher_level_tibble %>%
pmap_dfr(~ ..1 %>% filter(across(contains("cars"), \(x) x %in% ..2))) %>%
nest(my_filtered_data = everything()) %>%
bind_cols(higher_level_tibble, .)
#> # A tibble: 1 × 3
#> my_data the_cars_i_want my_filtered_data
#> <list> <list> <list>
#> 1 <df [32 × 12]> <chr [3]> <tibble [3 × 12]>
@Paul Smith's answer suggests that my own code that did not work, i.e.
higher_level_tibble %>%
mutate(my_filtered_data = map2(.x = my_data, .y = the_cars_i_want, .f = ~ filter(.x, across(starts_with("cars"), ~ . %in% .y))))
can be fixed by using an anonymous function, e.g.:
higher_level_tibble %>%
mutate(my_filtered_data = map2(.x = my_data, .y = the_cars_i_want, .f = ~ filter(.x, across(starts_with("cars"), function(x) x %in% .y))))
## # A tibble: 1 x 3
## my_data the_cars_i_want my_filtered_data
## <list> <list> <list>
## 1 <df [32 x 12]> <chr [3]> <df [3 x 12]>
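As far as I can tell, the reason the anonymous function helps is that a formula lambda such as ~ . %in% .y is converted into a function with its own .x and .y arguments. Inside across(), that inner function is called with one argument (the column), so its .y refers to a missing second argument rather than to the .y of the outer map2() lambda. The sketch below illustrates this with rlang::as_function() and then applies the fix. Note that it swaps in if_all() for across() (assuming dplyr 1.0.4 or later), and the names inner, one_arg, col and result are only for illustration:

```r
library(dplyr)
library(purrr)
library(tibble)

my_mtcars <- rownames_to_column(mtcars, "cars")
cars_to_keep <- c("Merc 240D", "Fiat X1-9", "Ferrari Dino")
higher_level_tibble <- tibble(
  my_data = list(my_mtcars),
  the_cars_i_want = list(cars_to_keep)
)

# A formula lambda becomes a function whose `.y` is its own second argument.
inner <- rlang::as_function(~ . %in% .y)
inner(c("a", "b"), "a")  # fine: a second argument is supplied

# Called with a single argument (as happens per column), `.y` cannot be resolved.
one_arg <- tryCatch(inner(c("a", "b")), error = function(e) "error")

# An explicit function(col) has no `.y` of its own, so `.y` is looked up
# in the enclosing map2() lambda, where it holds the cars to keep.
result <- higher_level_tibble %>%
  mutate(my_filtered_data = map2(
    my_data, the_cars_i_want,
    ~ filter(.x, if_all(starts_with("cars"), function(col) col %in% .y))
  ))
```

The same reasoning explains why the \(x) x %in% ..2 form in the pmap_dfr answer works: only the outer function is a formula lambda, so ..2 is unambiguous.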