R {dplyr}: `rename` or `mutate` data.frames in `rowwise` list-column with different column names on LHS

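I am working with list-columns of data.frames in {dplyr} 1.0.0, and I am wondering whether it is possible to `rename()` or `mutate()` columns in each data.frame without leaving the pipe when the nested data.frame is grouped with `rowwise`.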

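Why do I want to know or do this? As far as I understand the philosophy of {dplyr} 1.0.0, it recommends `rowwise()` over using the `map` family from {purrr} on list-columns. Below I first show what I did before {dplyr} 1.0.0, and then several examples with {dplyr} 1.0.0 (most of which do not work).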

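While {rlang} supports glue strings on the left-hand side (LHS), which can be used when writing custom {dplyr} functions, the LHS of {dplyr} functions inside a rowwise tibble does not seem to support them yet (at least my examples below do not work).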
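For reference, this is roughly what I mean by a glue string on the LHS inside a custom function. It is a minimal sketch: the helper name `add_col_glue` is made up for illustration, and it assumes an {rlang} version recent enough to support glue strings (the one required by {dplyr} 1.0.0 should do).

```r
library(dplyr, quietly = TRUE, warn.conflicts = FALSE)

# Hypothetical helper: `name` is a plain string that is used both as the
# name of the new column (via the glue string) and as its value.
add_col_glue <- function(df, name) {
  mutate(df, "{name}" := name)
}

out <- add_col_glue(head(iris), "a")
names(out)
```

Here the string is available in the function's environment when `mutate()` captures its arguments, which is the situation I cannot reproduce directly inside a rowwise pipe.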

For `rename` I found a way using `rename_with()`, but I have not figured out how to do the same with `mutate`.

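I also do not understand most of the error messages I get. They more or less say that I am not supplying a string on the LHS of `:=`, but in rowwise mode the column I refer to (`new`) is actually a character vector of length 1.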

library(dplyr, quietly = TRUE, warn.conflicts = FALSE)
library(purrr)

myiris <- iris %>% 
  nest_by(Species, .key = "mydat") %>% 
  ungroup %>% 
  mutate(new = letters[1:3])

# our data looks like this
# we want to use the strings in column `new` on the LHS of `rename` and `mutate`
myiris
#> # A tibble: 3 x 3
#>   Species                 mydat new  
#>   <fct>      <list<tbl_df[,4]>> <chr>
#> 1 setosa               [50 x 4] a    
#> 2 versicolor           [50 x 4] b    
#> 3 virginica            [50 x 4] c

# For reference: under dplyr < 1.0 I did the following:

# rename in pipe
# working
myiris %>% 
  mutate(mydat = map2(mydat, new,
                      ~ rename_at(.x, "Sepal.Length", function(z) paste(.y)))) %>% 
  pull(mydat)
#> [[1]]
#> # A tibble: 50 x 4
#>       a Sepal.Width Petal.Length Petal.Width
#>   <dbl>       <dbl>        <dbl>       <dbl>
#> 1   5.1         3.5          1.4         0.2
#> 2   4.9         3            1.4         0.2
#> 3   4.7         3.2          1.3         0.2
#> 4   4.6         3.1          1.5         0.2
#> # ... with 46 more rows
#> 
#> [[2]]
#> # A tibble: 50 x 4
#>       b Sepal.Width Petal.Length Petal.Width
#>   <dbl>       <dbl>        <dbl>       <dbl>
#> 1   7           3.2          4.7         1.4
#> 2   6.4         3.2          4.5         1.5
#> 3   6.9         3.1          4.9         1.5
#> 4   5.5         2.3          4           1.3
#> # ... with 46 more rows
#> 
#> [[3]]
#> # A tibble: 50 x 4
#>       c Sepal.Width Petal.Length Petal.Width
#>   <dbl>       <dbl>        <dbl>       <dbl>
#> 1   6.3         3.3          6           2.5
#> 2   5.8         2.7          5.1         1.9
#> 3   7.1         3            5.9         2.1
#> 4   6.3         2.9          5.6         1.8
#> # ... with 46 more rows

# mutate in pipe
# was never working even under dplyr < 1.0.0
myiris %>% 
  mutate(mydat = map2(mydat, new,
                      ~ mutate(.x, eval(.y) := .y))) %>% 
  pull(mydat)
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `map2(mydat, new, ~mutate(.x, `:=`(eval(.y), .y)))`.

# mutate with custom function
# working
mymutate <- function(df, y) {
  mutate(df, !! y := y)
}

myiris %>% 
  mutate(mydat = map2(mydat, new,
                      ~ mymutate(.x, .y))) %>% 
  pull(mydat)
#> [[1]]
#> # A tibble: 50 x 5
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width a    
#>          <dbl>       <dbl>        <dbl>       <dbl> <chr>
#> 1          5.1         3.5          1.4         0.2 a    
#> 2          4.9         3            1.4         0.2 a    
#> 3          4.7         3.2          1.3         0.2 a    
#> 4          4.6         3.1          1.5         0.2 a    
#> # ... with 46 more rows
#> 
#> [[2]]
#> # A tibble: 50 x 5
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width b    
#>          <dbl>       <dbl>        <dbl>       <dbl> <chr>
#> 1          7           3.2          4.7         1.4 b    
#> 2          6.4         3.2          4.5         1.5 b    
#> 3          6.9         3.1          4.9         1.5 b    
#> 4          5.5         2.3          4           1.3 b    
#> # ... with 46 more rows
#> 
#> [[3]]
#> # A tibble: 50 x 5
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width c    
#>          <dbl>       <dbl>        <dbl>       <dbl> <chr>
#> 1          6.3         3.3          6           2.5 c    
#> 2          5.8         2.7          5.1         1.9 c    
#> 3          7.1         3            5.9         2.1 c    
#> 4          6.3         2.9          5.6         1.8 c    
#> # ... with 46 more rows





# dplyr >= 1.0.0
# objective: `rename()` or `mutate()` in pipe on list-column of data.frames 
#            while using different column names on LHS coming from another
#            column (here `new`)

myiris_row <- myiris %>% rowwise

# rename --------
# not working
myiris_row %>% 
  mutate(mydat = list(mydat %>% rename({{new}} := "Sepal.Length"))) 
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `list(...)`.
#> i The error occured in row 1.

# not working
myiris_row %>% 
  mutate(mydat = list(mydat %>% rename(!! new := "Sepal.Length")))  
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `list(...)`.
#> i The error occured in row 1.

# not working
myiris_row %>% 
  mutate(mydat = list(mydat %>% rename(!! sym(new) := "Sepal.Length")))  
#> Error: Only strings can be converted to symbols

# not working
myiris_row %>% 
  mutate(mydat = list(mydat %>% rename(all_of(new) := "Sepal.Length")))  
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `list(mydat %>% rename(`:=`(all_of(new), "Sepal.Length")))`.
#> i The error occured in row 1.

# working, but only with `rename_with()`
myiris_row %>% 
  mutate(mydat = list(mydat %>% rename_with(~ new, "Sepal.Length")))  %>%
  pull(mydat)
#> [[1]]
#> # A tibble: 50 x 4
#>       a Sepal.Width Petal.Length Petal.Width
#>   <dbl>       <dbl>        <dbl>       <dbl>
#> 1   5.1         3.5          1.4         0.2
#> 2   4.9         3            1.4         0.2
#> 3   4.7         3.2          1.3         0.2
#> 4   4.6         3.1          1.5         0.2
#> # ... with 46 more rows
#> 
#> [[2]]
#> # A tibble: 50 x 4
#>       b Sepal.Width Petal.Length Petal.Width
#>   <dbl>       <dbl>        <dbl>       <dbl>
#> 1   7           3.2          4.7         1.4
#> 2   6.4         3.2          4.5         1.5
#> 3   6.9         3.1          4.9         1.5
#> 4   5.5         2.3          4           1.3
#> # ... with 46 more rows
#> 
#> [[3]]
#> # A tibble: 50 x 4
#>       c Sepal.Width Petal.Length Petal.Width
#>   <dbl>       <dbl>        <dbl>       <dbl>
#> 1   6.3         3.3          6           2.5
#> 2   5.8         2.7          5.1         1.9
#> 3   7.1         3            5.9         2.1
#> 4   6.3         2.9          5.6         1.8
#> # ... with 46 more rows


# mutate ------
# the values of the new column don't matter
# here we just use the same input as the name, to show that RHS evaluation is easier.

# not working
myiris_row %>% 
  mutate(mydat = list(mydat %>% mutate(!! new := new))) 
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `list(...)`.
#> i The error occured in row 1.

# not working
myiris_row %>% 
  mutate(mydat = list(mydat %>% mutate(!! sym(new) := new))) 
#> Error: Only strings can be converted to symbols

# not working
myiris_row %>% 
  mutate(mydat = list(mydat %>% mutate(all_of(new) := new))) 
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `list(mydat %>% mutate(`:=`(all_of(new), new)))`.
#> i The error occured in row 1.

# almost working (but what is going on in mydat[[1]]?)
myiris_row %>% 
  mutate(mydat = list(mydat %>% mutate("{{new}}" := new)))  %>%
  pull(mydat)
#> [[1]]
#> # A tibble: 50 x 5
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width `promise_fn(3L)`
#>          <dbl>       <dbl>        <dbl>       <dbl> <chr>           
#> 1          5.1         3.5          1.4         0.2 a               
#> 2          4.9         3            1.4         0.2 a               
#> 3          4.7         3.2          1.3         0.2 a               
#> 4          4.6         3.1          1.5         0.2 a               
#> # ... with 46 more rows
#> 
#> [[2]]
#> # A tibble: 50 x 5
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width `"b"`
#>          <dbl>       <dbl>        <dbl>       <dbl> <chr>
#> 1          7           3.2          4.7         1.4 b    
#> 2          6.4         3.2          4.5         1.5 b    
#> 3          6.9         3.1          4.9         1.5 b    
#> 4          5.5         2.3          4           1.3 b    
#> # ... with 46 more rows
#> 
#> [[3]]
#> # A tibble: 50 x 5
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width `"c"`
#>          <dbl>       <dbl>        <dbl>       <dbl> <chr>
#> 1          6.3         3.3          6           2.5 c    
#> 2          5.8         2.7          5.1         1.9 c    
#> 3          7.1         3            5.9         2.1 c    
#> 4          6.3         2.9          5.6         1.8 c    
#> # ... with 46 more rows

Created on 2020-12-22 by the reprex package (v0.3.0)

You can use `quote()` to protect your `!!` from the outer call, and then unquote it again with `!!` inside the nested call:

myiris_row %>% 
  mutate(mydat = list(mydat %>% mutate(!! quote(!!new) := new))) %>%
  pull(mydat)
#> [[1]]
#> # A tibble: 50 x 5
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width a    
#>           <dbl>       <dbl>        <dbl>       <dbl> <chr>
#>  1          5.1         3.5          1.4         0.2 a    
#>  2          4.9         3            1.4         0.2 a    
#>  3          4.7         3.2          1.3         0.2 a    
#>  4          4.6         3.1          1.5         0.2 a    
#>  5          5           3.6          1.4         0.2 a    
#>  6          5.4         3.9          1.7         0.4 a    
#>  7          4.6         3.4          1.4         0.3 a    
#>  8          5           3.4          1.5         0.2 a    
#>  9          4.4         2.9          1.4         0.2 a    
#> 10          4.9         3.1          1.5         0.1 a    
#> # ... with 40 more rows
#> 
#> [[2]]
#> # A tibble: 50 x 5
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width b    
#>           <dbl>       <dbl>        <dbl>       <dbl> <chr>
#>  1          7           3.2          4.7         1.4 b    
#>  2          6.4         3.2          4.5         1.5 b    
#>  3          6.9         3.1          4.9         1.5 b    
#>  4          5.5         2.3          4           1.3 b    
#>  5          6.5         2.8          4.6         1.5 b    
#>  6          5.7         2.8          4.5         1.3 b    
#>  7          6.3         3.3          4.7         1.6 b    
#>  8          4.9         2.4          3.3         1   b    
#>  9          6.6         2.9          4.6         1.3 b    
#> 10          5.2         2.7          3.9         1.4 b    
#> # ... with 40 more rows
#> 
#> [[3]]
#> # A tibble: 50 x 5
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width c    
#>           <dbl>       <dbl>        <dbl>       <dbl> <chr>
#>  1          6.3         3.3          6           2.5 c    
#>  2          5.8         2.7          5.1         1.9 c    
#>  3          7.1         3            5.9         2.1 c    
#>  4          6.3         2.9          5.6         1.8 c    
#>  5          6.5         3            5.8         2.2 c    
#>  6          7.6         3            6.6         2.1 c    
#>  7          4.9         2.5          4.5         1.7 c    
#>  8          7.3         2.9          6.3         1.8 c    
#>  9          6.7         2.5          5.8         1.8 c    
#> 10          7.2         3.6          6.1         2.5 c    
#> # ... with 40 more rows
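
To see why the `quote()` trick works, it may help to look at what the outer call ends up capturing. The following is a rough sketch using `rlang::expr()`, which applies the same `!!` processing as the {dplyr} verbs do when they capture their arguments; the names `captured` and `lhs` are only illustrative. The outer `!!` is resolved first and inserts the quoted `!!new` into the inner call, so it is the inner `mutate()` that finally unquotes `new`, at a point where `new` refers to the value of the current row.

```r
library(rlang)

# Roughly what remains of the inner call once the outer verb has
# processed its own `!!` (nothing is evaluated here, so `mutate`
# does not even need to be attached).
captured <- expr(mutate(!!quote(!!new) := new))

# The LHS of `:=` is now the unevaluated call `!!new`,
# left over for the inner verb to unquote.
lhs <- captured[[2]][[2]]
is.call(lhs)
identical(lhs, quote(!!new))
```

Without `quote()`, the outer verb tries to resolve `new` itself before any row data is available, which is presumably why the unprotected attempts in the question fail.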

The same approach works for `rename()`:

myiris_row %>% 
  mutate(mydat = list(mydat %>% rename(!! quote(!!new) := "Sepal.Length"))) %>%
  pull(mydat)
#> [[1]]
#> # A tibble: 50 x 4
#>        a Sepal.Width Petal.Length Petal.Width
#>    <dbl>       <dbl>        <dbl>       <dbl>
#>  1   5.1         3.5          1.4         0.2
#>  2   4.9         3            1.4         0.2
#>  3   4.7         3.2          1.3         0.2
#>  4   4.6         3.1          1.5         0.2
#>  5   5           3.6          1.4         0.2
#>  6   5.4         3.9          1.7         0.4
#>  7   4.6         3.4          1.4         0.3
#>  8   5           3.4          1.5         0.2
#>  9   4.4         2.9          1.4         0.2
#> 10   4.9         3.1          1.5         0.1
#> # ... with 40 more rows
#> 
#> [[2]]
#> # A tibble: 50 x 4
#>        b Sepal.Width Petal.Length Petal.Width
#>    <dbl>       <dbl>        <dbl>       <dbl>
#>  1   7           3.2          4.7         1.4
#>  2   6.4         3.2          4.5         1.5
#>  3   6.9         3.1          4.9         1.5
#>  4   5.5         2.3          4           1.3
#>  5   6.5         2.8          4.6         1.5
#>  6   5.7         2.8          4.5         1.3
#>  7   6.3         3.3          4.7         1.6
#>  8   4.9         2.4          3.3         1  
#>  9   6.6         2.9          4.6         1.3
#> 10   5.2         2.7          3.9         1.4
#> # ... with 40 more rows
#> 
#> [[3]]
#> # A tibble: 50 x 4
#>        c Sepal.Width Petal.Length Petal.Width
#>    <dbl>       <dbl>        <dbl>       <dbl>
#>  1   6.3         3.3          6           2.5
#>  2   5.8         2.7          5.1         1.9
#>  3   7.1         3            5.9         2.1
#>  4   6.3         2.9          5.6         1.8
#>  5   6.5         3            5.8         2.2
#>  6   7.6         3            6.6         2.1
#>  7   4.9         2.5          4.5         1.7
#>  8   7.3         2.9          6.3         1.8
#>  9   6.7         2.5          5.8         1.8
#> 10   7.2         3.6          6.1         2.5
#> # ... with 40 more rows
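
If a small helper function is acceptable, another option is to wrap the inner verb, much like `mymutate()` in the question, and call it from the rowwise pipe. This is only a sketch under that assumption, not necessarily the approach the {dplyr} authors would recommend; the helper name `add_named_col` is hypothetical.

```r
library(dplyr, quietly = TRUE, warn.conflicts = FALSE)

# Hypothetical helper: inside the function, `name` is an ordinary string,
# so `!!name` on the LHS of `:=` unquotes to a valid column name.
add_named_col <- function(df, name) {
  mutate(df, !!name := name)
}

# Same data as in the question, switched to rowwise mode.
myiris_row <- iris %>%
  nest_by(Species, .key = "mydat") %>%
  ungroup() %>%
  mutate(new = letters[1:3]) %>%
  rowwise()

# In rowwise mode `mydat` is the data.frame of the current row and `new`
# is a length-1 character vector, so both can be passed on directly.
res <- myiris_row %>%
  mutate(mydat = list(add_named_col(mydat, new))) %>%
  pull(mydat)

lapply(res, names)
```

The trade-off is that the unquoting happens inside the helper, so the pipe itself stays free of nested `!!`, at the cost of defining one extra function.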