Changing the column name based on a partial string or substring

Question

I have a data frame df. I generate this data frame five times, once for each of 5 different variables. Suppose the variable names are:

Apple  # apple_df
Mango  # mango_df
Banana # banana_df
Potato # potato_df
Tomato # tomato_df

Each time I generate the data frame, one of the column names is very long, for example:

Apple - Growth Level Judgement    # Column name for apple_df
Mango - Growth Level Judgement    # Column name for mango_df
Banana - Growth Level Judgement   # Column name for banana_df
Potato - Growth Level Judgement   # Column name for potato_df
Tomato - Growth Level Judgement   # Column name for tomato_df

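I want to change the above column name to the word Growth in each data frame.

Is there a way to do this efficiently with one common line of code that works for every data frame (applied to each one separately)?

I can use the full name separately for each data frame, but I would like to know whether there is a generic solution: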

# For Apple data frame
library(data.table)  # provides setnames()

# Update column name
setnames(apple_df, 
         old = c('Apple - Growth Level Judgement'), 
         new = c('Growth'))

If I use the following regex-based solution, it replaces only the part of the name that is common to all the data frames, not the full name.

gsub(x = names(apple_df), 
     pattern = "Growth Level Judgement$", replacement = "Growth")

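Related posts:

The post below is related, but it removes a known part of the string. In my case, I want to detect the column based on a partial string that stays the same across multiple datasets, and once that string is detected in a column name, I want to change the entire column name. Other posts may also be related but do not meet my needs.

Any suggestions would be appreciated. Thanks!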

Answer 1

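Put the data frames in a list and use lapply/map to change the names of each data frame. list2env transfers these changes from the list back to the individual data frames.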

library(dplyr)
library(purrr)

list_df <- lst(Apple, Mango, Banana, Potato, Tomato)

list_df <- map(list_df, 
             ~.x %>% rename_with(~'Growth', matches('Growth Level Judgement')))

list2env(list_df, .GlobalEnv)
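The same list-based idea can be sketched in base R with lapply instead of map. This is only a minimal sketch under assumptions: the two toy data frames, the `id` column, and all values are made up for illustration, and `environment()` is used so that the renamed frames land in whatever environment the code runs in (the global environment at top level).

```r
# Toy data frames standing in for the real ones (values are made up)
Apple <- data.frame(id = 1:2, `Apple - Growth Level Judgement` = c(3, 4),
                    check.names = FALSE)
Mango <- data.frame(id = 1:2, `Mango - Growth Level Judgement` = c(5, 6),
                    check.names = FALSE)

# Named list, so list2env() can map each element back to its variable
list_df <- list(Apple = Apple, Mango = Mango)

list_df <- lapply(list_df, function(d) {
  # Replace the whole name of any column that contains the shared substring
  names(d)[grepl("Growth Level Judgement", names(d), fixed = TRUE)] <- "Growth"
  d
})

# Overwrite the original variables with the renamed versions
list2env(list_df, envir = environment())

names(Apple)
names(Mango)
```

Using `fixed = TRUE` treats the substring literally, which avoids surprises if the shared part of the name ever contains regex metacharacters.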

To run it on a single data frame you can do -

Apple %>% rename_with(~'Growth', matches('Growth Level Judgement'))

Or in base R -

names(Apple)[grep('Growth Level Judgement', names(Apple))] <- 'Growth'
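One caveat with the one-liner above: it assigns 'Growth' to every matching column, so if more than one name contains the substring you end up with duplicate column names. A possible guard is sketched below. `rename_by_substring` is a hypothetical helper written for this example, not a function from any package, and the toy data frame is made up.

```r
# Hypothetical helper (not from any package): rename the single column whose
# name contains `pattern`, and stop with an error otherwise.
rename_by_substring <- function(df, pattern, new_name) {
  hits <- grep(pattern, names(df), fixed = TRUE)  # literal substring match
  if (length(hits) != 1L) {
    stop("expected exactly one matching column, found ", length(hits))
  }
  names(df)[hits] <- new_name
  df
}

# Toy data frame (values are made up)
Banana <- data.frame(`Banana - Growth Level Judgement` = c(1, 2),
                     weight = c(10, 12), check.names = FALSE)

Banana <- rename_by_substring(Banana, "Growth Level Judgement", "Growth")
names(Banana)
```

The same call works unchanged for every data frame, since only the shared part of the column name is passed in.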

Answer 2

An alternative solution could be:

Apple %>% 
      rename_with(~'Growth', ends_with('Growth Level Judgement'))

Answer 3

Using endsWith from base R

names(Apple)[endsWith(names(Apple), 'Growth Level Judgement')] <- 'Growth'

According to the documentation (?endsWith), it may also be faster:

startsWith() is equivalent to but much faster than

substring(x, 1, nchar(prefix)) == prefix

or also

grepl("^<prefix>", x)
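As a quick sanity check of the suffix case, the sketch below compares endsWith with a grepl call anchored at the end of the string. The column names are made up for illustration. It also shows one practical difference: endsWith treats the suffix literally, while grepl interprets it as a regular expression.

```r
# Made-up column names: only the first one ends with the shared suffix
cols <- c("Tomato - Growth Level Judgement",
          "Growth Level Judgement (old)",
          "id")
suffix <- "Growth Level Judgement"

by_endswith <- endsWith(cols, suffix)
by_grepl <- grepl(paste0(suffix, "$"), cols)

by_endswith                       # TRUE FALSE FALSE
identical(by_endswith, by_grepl)  # TRUE for this suffix (no regex metacharacters)

# Literal vs. regex matching: "." is a wildcard for grepl but not for endsWith
endsWith("axb", ".b")   # FALSE
grepl(".b$", "axb")     # TRUE
```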
