Renaming Columns by Pasting First Row if not NA

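I'm looking to standardize the cleaning of Survey Monkey exports in my organization. Where the first row is not NA, I want to rename the column to (column name + first row value).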

Edit: This would ideally be implemented in a function/loop so that it works on data frames of varying sizes without having to edit any parameters.

Reprex:

library(tibble)
df <- tribble(
  ~`Which of these choices do you like`, ~`...1`, ~`...2`, ~`...3`, ~`Respondent ID`, ~`Different Text`, ~`...4`,
  'Fruit', 'Drink', 'Dessert', 'Snack', NA, 'Pizza Topping', 'Pizza Style',
  'Apple', 'Water', 'Pie', 'Oreos', 1234, 'Mushroom', 'Deep Dish',
  'Apple', 'Coffee', 'Cake', 'Granola', 1235, 'Onion', 'NY Style',
  'Banana', 'Coffee', 'Pie', 'Oreos', 1236, 'Mushroom', 'NY Style',
  'Pear', 'Vodka', 'Pie', 'Granola', 1237, 'Onion', 'Deep Dish'
)

After the columns are renamed, I will remove the first row and get on with my life.

Ideally, my df would look like this:

Thanks for any guidance!

In base R, we can use paste0 to build the new names and then remove the first row:

names(df)[1:4] <- paste0(names(df)[1], unlist(df[1, 1:4]))
df <- df[-1, ]

Or use sprintf, which puts the first-row value in parentheses after the column name:

names(df)[1:4] <- sprintf("%s (%s)", names(df)[1], unlist(df[1, 1:4]))
df <- df[-1,]
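
To make the difference between the two base R variants concrete, here is a small sketch on a single header and first-row value. The variable names `header` and `value` are only for illustration and are not part of the code above.

```r
# Illustrative values taken from the example data
header <- "Which of these choices do you like"
value  <- "Fruit"

# paste0() joins the two strings with no separator
joined_plain <- paste0(header, value)

# sprintf() lets us control the layout, here "header (value)"
joined_fmt <- sprintf("%s (%s)", header, value)

joined_plain
joined_fmt
```

If the goal is names like "Which of these choices do you like (Fruit)", the sprintf form is probably the one to pick.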

If we want to do this by checking for NA elements, so that no column positions are hard-coded:

library(dplyr)
library(tidyr)
library(purrr)
library(stringr)
keydat <- df %>%
          slice(1) %>% 
          select_if(negate(is.na)) %>%
          pivot_longer(everything()) %>%
          group_by(grp = cumsum(!startsWith(name, "..."))) %>% 
          mutate(value = sprintf("%s (%s)", first(name), value)) %>% 
          ungroup %>% 
          select(-grp)


df <- df %>%
        rename_at(vars(keydat$name), ~ keydat$value) %>%
        slice(-1)

df
# A tibble: 4 x 7
#  `Which of these… `Which of these… `Which of these… `Which of these… `Respondent ID`
#  <chr>            <chr>            <chr>            <chr>                      <dbl>
#1 Apple            Water            Pie              Oreos                       1234
#2 Apple            Coffee           Cake             Granola                     1235
#3 Banana           Coffee           Pie              Oreos                       1236
#4 Pear             Vodka            Pie              Granola                     1237
# … with 2 more variables: `Different Text (Pizza Topping)` <chr>, `Different Text (Pizza
#   Style)` <chr>

names(df)
#[1] "Which of these choices do you like (Fruit)"   "Which of these choices do you like (Drink)"  
#[3] "Which of these choices do you like (Dessert)" "Which of these choices do you like (Snack)"  
#[5] "Respondent ID"                                "Different Text (Pizza Topping)"              
#[7] "Different Text (Pizza Style)"
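
Since the question asks for something that works on data frames of any size, the same idea can also be sketched as a base R function. This is only a sketch under some assumptions: the helper name `rename_from_first_row` and its `placeholder` argument are made up here, the auto-generated columns are assumed to be named like `...1`, `...2`, and the first column is assumed to carry a real header.

```r
# Hypothetical helper (not from the answer above): append the first-row value
# to the column name wherever the first row is not NA, then drop that row.
rename_from_first_row <- function(df, placeholder = "^\\.\\.\\.[0-9]+$") {
  nms <- names(df)
  # First-row values as a character vector (NA stays NA)
  first_row <- vapply(df[1, , drop = FALSE], as.character, character(1))

  # Columns whose names are real headers rather than "...1"-style fillers
  is_header <- !grepl(placeholder, nms)
  stopifnot(is_header[1])  # assumption: the first column has a real header

  # Carry each real header forward over the filler columns that follow it
  base <- nms[is_header][cumsum(is_header)]

  # Rename only the columns whose first-row value is not NA
  has_sub <- !is.na(first_row)
  nms[has_sub] <- sprintf("%s (%s)", base[has_sub], first_row[has_sub])

  names(df) <- nms
  df[-1, , drop = FALSE]
}

# Small example in the shape of the reprex (a plain data.frame, no packages)
df <- data.frame(
  a  = c("Fruit", "Apple", "Pear"),
  b  = c("Drink", "Water", "Vodka"),
  id = c(NA, 1234, 1237),
  d  = c("Pizza Topping", "Mushroom", "Onion"),
  stringsAsFactors = FALSE
)
names(df) <- c("Which of these choices do you like", "...1",
               "Respondent ID", "Different Text")

out <- rename_from_first_row(df)
names(out)
```

Columns with an NA in the first row (such as `Respondent ID`) keep their original names, and nothing in the call depends on how many columns the export has.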