Using ~separate after mutate and across

Question

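The goal is to collapse all rows for the species "setosa" into a single "setosa" row. (This is a minimal example; the real data has more columns and more groups.)

I have this data frame: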

head(iris, 2) %>%
  select(1,2,5) %>% 
  group_by(Species)

  Sepal.Length Sepal.Width Species
         <dbl>       <dbl> <fct>  
1          5.1         3.5 setosa 
2          4.9         3   setosa

Using summarise and toString, I get:

  Species Sepal.Length Sepal.Width
  <fct>   <chr>        <chr>      
1 setosa  5.1, 4.9     3.5, 3

Expected output: I want this data frame:

  Species Sepal.Length1 Sepal.Length2 Sepal.Width1 Sepal.Width2
  <fct>           <dbl>         <dbl>        <dbl>        <int>
1 setosa            5.1           4.9          3.5            3

I achieved this with the following working code:

head(iris, 2) %>%
  select(1,2,5) %>% 
  group_by(Species) %>% 
  summarise(across(everything(), ~toString(.))) %>% 
  ungroup() %>% 
  separate(Sepal.Length, c("Sepal.Length1", "Sepal.Length2"),  sep = ", ", convert = TRUE) %>% 
  separate(Sepal.Width, c("Sepal.Width1", "Sepal.Width2"),  sep = ", ", convert = TRUE)

However, I would like to call separate from an anonymous function inside mutate(across(...)), as in this non-working code:

head(iris, 2) %>%
  select(1,2,5) %>% 
  group_by(Species) %>% 
  summarise(across(everything(), ~toString(.))) %>% 
  ungroup() %>% 
  mutate(across(-1, ~separate(., into = paste0(., 1:2), sep = ", ", convert = TRUE)))

Error: Problem with `mutate()` input `..1`.
i `..1 = across(-1, ~separate(., into = paste0(., 1:2), sep = ", ", convert = TRUE))`.
x no applicable method for 'separate' applied to an object of class "character"

I would like to learn how to apply the separate function inside mutate and across.

Answer 1

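The main issue is that separate expects a data.frame as input, whereas the function inside across receives a plain vector. If we want to run separate within across, we can wrap each column in a tibble, store the result in a list, and unnest the list output at the end.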

library(dplyr)
library(tidyr)
library(stringr)
head(iris, 2) %>%
  select(1,2,5) %>% 
  group_by(Species) %>% 
  summarise(across(everything(), ~toString(.)), .groups = 'drop') %>%
  mutate(across(-1, ~ list(tibble(col1 = .) %>% 
        separate(col1, into = str_c(cur_column(), 1:2), sep = ",\\s+")))) %>% 
  unnest(cols = c(Sepal.Length, Sepal.Width))

Output:

# A tibble: 1 × 5
  Species Sepal.Length1 Sepal.Length2 Sepal.Width1 Sepal.Width2
  <fct>   <chr>         <chr>         <chr>        <chr>       
1 setosa  5.1           4.9           3.5          3
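The output above is all character columns, while the expected output in the question is numeric. A minimal sketch of the same wrapping idea with convert = TRUE added, assuming the one-group example data from the question (the names collapsed and result are only for illustration):

```r
library(dplyr)
library(tidyr)

# Collapse each group into comma-separated strings, as in the question.
collapsed <- head(iris, 2) %>%
  select(1, 2, 5) %>%
  group_by(Species) %>%
  summarise(across(everything(), ~ toString(.)), .groups = "drop")

# Inside across(), the lambda sees a plain character vector like this one,
# which is why calling separate() on it directly fails.
class(collapsed$Sepal.Length)

# Wrapping the vector in a tibble gives separate() a data frame to work on;
# convert = TRUE asks it to re-parse the split pieces as numbers.
result <- collapsed %>%
  mutate(across(-1, ~ list(
    tibble(col1 = .) %>%
      separate(col1, into = paste0(cur_column(), 1:2),
               sep = ",\\s+", convert = TRUE)
  ))) %>%
  unnest(cols = c(Sepal.Length, Sepal.Width))

result
```

Note that the regular expression needs a doubled backslash (",\\s+") inside an R string.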

Answer 2

Another approach: pivot longer, then pivot wider.

library(tidyverse)
head(iris, 2) %>%
  select(1,2,5) %>% 

  pivot_longer(-Species) %>%
  group_by(name) %>% mutate(col = paste0(name, row_number())) %>% ungroup() %>%
  select(-name) %>%
  arrange(col) %>%  # for ordering columns like OP
  pivot_wider(names_from = col, values_from = value)


# A tibble: 1 x 5
  Species Sepal.Length1 Sepal.Length2 Sepal.Width1 Sepal.Width2
  <fct>           <dbl>         <dbl>        <dbl>        <dbl>
1 setosa            5.1           4.9          3.5            3
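Since the question mentions more groups, here is a hedged sketch of how the pivot approach might extend to several species. It assumes two rows per species (taken with slice_head, which is my choice and not part of the answer) and numbers the rows within each Species and name combination rather than within name alone. names_sort = TRUE is used in place of arrange to order the new columns.

```r
library(dplyr)
library(tidyr)

wide <- iris %>%
  group_by(Species) %>%
  slice_head(n = 2) %>%        # assumption: first two rows of each species
  ungroup() %>%
  select(1, 2, 5) %>%
  pivot_longer(-Species) %>%
  # Number the values within each species and measurement,
  # so every species gets suffixes 1 and 2.
  group_by(Species, name) %>%
  mutate(col = paste0(name, row_number())) %>%
  ungroup() %>%
  select(-name) %>%
  pivot_wider(names_from = col, values_from = value, names_sort = TRUE)

wide
```

This should give one row per species, with the values kept as numbers because nothing is converted to a string along the way.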

Answer 3

Another solution:

library(tidyverse)

head(iris, 2) %>%
  select(1,2,5) %>% 
  group_by(Species) %>% 
  summarise(across(everything(), ~toString(.))) %>% 
  separate(2, into = paste0("Sepal.Length",1:2),  sep=", ") %>% 
  separate(4, into = paste0("Sepal.Width",1:2),  sep=", ")

#> # A tibble: 1 × 5
#>   Species Sepal.Length1 Sepal.Length2 Sepal.Width1 Sepal.Width2
#>   <fct>   <chr>         <chr>         <chr>        <chr>       
#> 1 setosa  5.1           4.9           3.5          3
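With many columns, writing one separate call per column gets tedious. One possible sketch is to loop over the column names and call separate once per column. The names out, cols_to_split and col_name are hypothetical, and the use of rlang::sym to pass the column name is my assumption about a convenient way to do it, not something from the answer.

```r
library(dplyr)
library(tidyr)

out <- head(iris, 2) %>%
  select(1, 2, 5) %>%
  group_by(Species) %>%
  summarise(across(everything(), ~ toString(.)), .groups = "drop")

# Split every column except the grouping column.
cols_to_split <- setdiff(names(out), "Species")

for (col_name in cols_to_split) {
  out <- separate(
    out,
    col = !!rlang::sym(col_name),   # turn the string into a column reference
    into = paste0(col_name, 1:2),   # e.g. Sepal.Length1, Sepal.Length2
    sep = ", ",
    convert = TRUE                  # re-parse the pieces as numbers
  )
}

out
```

Because separate puts the new columns where the old one was, the column order matches the expected output in the question.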
