Dplyr: add multiple columns with mutate/across from character vector

Question

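I want to add several columns (filled with NA) to a data.frame using dplyr. I have defined the names of the columns in a character vector. Usually, with only one new column, you can use the following pattern: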

test %>% 
  mutate(!!new_column := NA)

However, I can't get this to work with across:

library(dplyr)

test <- data.frame(a = 1:3)
add_cols <- c("col_1", "col_2")

test %>% 
  mutate(across(!!add_cols, ~ NA))
#> Error: Problem with `mutate()` input `..1`.
#> x Can't subset columns that don't exist.
#> x Columns `col_1` and `col_2` don't exist.
#> ℹ Input `..1` is `across(c("col_1", "col_2"), ~NA)`.

test %>% 
  mutate(!!add_cols := NA)
#> Error: The LHS of `:=` must be a string or a symbol

expected_output <- data.frame(
  a = 1:3,
  col_1 = rep(NA, 3),
  col_2 = rep(NA, 3)
)
expected_output
#>   a col_1 col_2
#> 1 1    NA    NA
#> 2 2    NA    NA
#> 3 3    NA    NA

Created on 2021-10-05 by the reprex package (v1.0.0)

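With the first approach, the column names are built correctly, but across immediately tries to find them among the existing columns. With the second approach, I can only use a single string.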

Is there a tidyverse solution, or do I need to fall back on a good old for loop?

Answer 1

The !! operator works on a single element, so loop over the names:

for(nm in add_cols) test <- test %>% 
         mutate(!! nm := NA)

-output

> test
  a col_1 col_2
1 1    NA    NA
2 2    NA    NA
3 3    NA    NA

Or another option is

test %>% 
   bind_cols(setNames(rep(list(NA), length(add_cols)), add_cols))
  a col_1 col_2
1 1    NA    NA
2 2    NA    NA
3 3    NA    NA
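
The same named list can also be spliced straight into mutate() with !!!, which stays within dplyr and avoids the loop. This is a minimal sketch rather than part of the original answer; the names new_cols and result are introduced here purely for illustration.

```r
library(dplyr)

test <- data.frame(a = 1:3)
add_cols <- c("col_1", "col_2")

# Named list with one NA per new column: list(col_1 = NA, col_2 = NA)
# (new_cols is a hypothetical helper name, not from the original post)
new_cols <- setNames(rep(list(NA), length(add_cols)), add_cols)

# !!! splices the list into mutate() as if we had typed col_1 = NA, col_2 = NA;
# each length-1 NA is recycled to the number of rows
result <- test %>%
  mutate(!!!new_cols)

result
```

Because !!! unpacks a whole list of name-value pairs, it handles any number of names, whereas !! on the left-hand side of := accepts only a single string or symbol.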

In base R, this is easier

test[add_cols] <- NA

which can also be used in a pipe

test %>%
  `[<-`(., add_cols, value = NA)
  a col_1 col_2
1 1    NA    NA
2 2    NA    NA
3 3    NA    NA

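across works only if the columns already exist, i.e. it is meant to loop across the columns present in the data and either modify them or create new columns from them using the .names argument.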
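
To make that concrete, here is a small sketch of across deriving a new column from an existing one through .names. The doubling function and the "_doubled" suffix are arbitrary choices made for this illustration, not something from the original answer.

```r
library(dplyr)

test <- data.frame(a = 1:3)

# across() selects the existing column `a`; .names controls the name of the
# derived column, so the original `a` is kept and `a_doubled` is added
res <- test %>%
  mutate(across(a, ~ .x * 2, .names = "{.col}_doubled"))

res
```

The selection inside across is resolved against the columns the data already has, which is why passing names of columns that do not exist yet fails with "Can't subset columns that don't exist".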

We could also make use of add_column from tibble

library(tibble)
library(janitor)
add_column(test, !!! add_cols) %>% 
   clean_names %>% 
   mutate(across(all_of(add_cols), ~ NA))
  a col_1 col_2
1 1    NA    NA
2 2    NA    NA
3 3    NA    NA

Answer 2

Another approach:

library(tidyverse)
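# f() returns a one-row tibble with a single NA column named after x
f <- function(x) tibble(!!x := NA)
# map_dfc() binds those tibbles column-wise; mutate() unpacks and recycles them
mutate(test, map_dfc(add_cols, ~ f(.x)))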
