Renaming Columns by Pasting First Row if not NA
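I'm looking to standardize how we clean Survey Monkey exports in my organization. If the first row of a column is not NA, I want to rename that column to (column name + first-row value).
Edit: Ideally this would be implemented in a function/loop, so that it works on data frames of varying sizes without having to edit any parameters.
Reprex: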
library(tibble)

df <- tribble(
  ~`Which of these choices do you like`, ~`...1`, ~`...2`, ~`...3`, ~`Respondent ID`, ~`Different Text`, ~`...4`,
  'Fruit', 'Drink', 'Dessert', 'Snack', NA, 'Pizza Topping', 'Pizza Style',
  'Apple', 'Water', 'Pie', 'Oreos', 1234, 'Mushroom', 'Deep Dish',
  'Apple', 'Coffee', 'Cake', 'Granola', 1235, 'Onion', 'NY Style',
  'Banana', 'Coffee', 'Pie', 'Oreos', 1236, 'Mushroom', 'NY Style',
  'Pear', 'Vodka', 'Pie', 'Granola', 1237, 'Onion', 'Deep Dish'
)
After the columns are renamed, I will delete the first row and move on with my life.
Ideally, my df would look like this:
Thanks for any guidance!
In base R, we can use paste0 (which joins without a separator) and then remove the first row:
names(df)[1:4] <- paste0(names(df)[1], unlist(df[1, 1:4]))
df <- df[-1, ]
Or use sprintf, which puts the first-row value in parentheses after the column name:
names(df)[1:4] <- sprintf("%s (%s)", names(df)[1], unlist(df[1, 1:4]))
df <- df[-1,]
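Both snippets above hard-code the column range 1:4 and the first header, so they only cover the first question block. The following is a minimal base R sketch of one way to generalize them to any number of columns. The helper name rename_from_first_row and its placeholder argument are hypothetical, and it assumes that unnamed columns carry auto-generated names of the form ...1, ...2 and that the first column has a real header. The small df is a cut-down copy of the reprex data.

```r
# Hypothetical helper; the name and the `placeholder` argument are assumptions
# for illustration, not part of the original answer.
rename_from_first_row <- function(df, placeholder = "^\\.\\.\\.[0-9]+$") {
  nms <- names(df)
  is_real <- !grepl(placeholder, nms)
  # Assumes the first column carries a real header to inherit from.
  stopifnot(length(nms) > 0, is_real[1])

  # Carry each real header forward over the placeholder columns after it.
  base <- nms[is_real][cumsum(is_real)]

  # First-row values as character; NA means "leave this column name alone".
  label <- unlist(lapply(df[1, , drop = FALSE], as.character), use.names = FALSE)
  has_label <- !is.na(label)

  nms[has_label] <- sprintf("%s (%s)", base[has_label], label[has_label])
  names(df) <- nms
  df[-1, , drop = FALSE]  # drop the label row
}

# Cut-down version of the reprex data, built without extra packages.
df <- data.frame(
  c("Fruit", "Apple", "Pear"),
  c("Drink", "Water", "Vodka"),
  c(NA, 1234, 1237),
  c("Pizza Topping", "Mushroom", "Onion"),
  c("Pizza Style", "Deep Dish", "NY Style"),
  stringsAsFactors = FALSE
)
names(df) <- c("Which of these choices do you like", "...1",
               "Respondent ID", "Different Text", "...2")

out <- rename_from_first_row(df)
names(out)
```

Because the header is carried forward with cumsum, nothing in the call depends on how many columns or question blocks the export has.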
If we want to do this by checking for NA elements in the first row:
library(dplyr)
library(tidyr)
library(purrr)
library(stringr)
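# Build a lookup of old name -> new name from the first row
keydat <- df %>%
  slice(1) %>%                      # the first row holds the sub-labels
  select_if(negate(is.na)) %>%      # drop columns whose first row is NA
  pivot_longer(everything()) %>%    # one row per column: name, value
  # start a new group at every real header (any name not starting with "...")
  group_by(grp = cumsum(!startsWith(name, "..."))) %>%
  mutate(value = sprintf("%s (%s)", first(name), value)) %>%
  ungroup() %>%
  select(-grp)

# Apply the new names, then drop the label row
df <- df %>%
  rename_at(vars(keydat$name), ~ keydat$value) %>%
  slice(-1)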
df
# A tibble: 4 x 7
#  `Which of these… `Which of these… `Which of these… `Which of these… `Respondent ID`
#  <chr>            <chr>            <chr>            <chr>                      <dbl>
#1 Apple            Water            Pie              Oreos                       1234
#2 Apple            Coffee           Cake             Granola                     1235
#3 Banana           Coffee           Pie              Oreos                       1236
#4 Pear             Vodka            Pie              Granola                     1237
# … with 2 more variables: `Different Text (Pizza Topping)` <chr>, `Different Text (Pizza
#   Style)` <chr>
names(df)
#[1] "Which of these choices do you like (Fruit)"   "Which of these choices do you like (Drink)"
#[3] "Which of these choices do you like (Dessert)" "Which of these choices do you like (Snack)"
#[5] "Respondent ID"                                "Different Text (Pizza Topping)"
#[7] "Different Text (Pizza Style)"
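Since the question asks for something reusable, the same pipeline can be wrapped in a function. The sketch below is one possible way to do that, not the original answer's code: the name clean_survey_headers is hypothetical, and it swaps select_if() and rename_at() (both superseded since dplyr 1.0.0) for select(where()) and rename(all_of()). It assumes dplyr >= 1.0.0 and tidyr >= 1.0.0 are installed, that at least one column has a non-NA first row, and that unnamed columns start with "...". The data is a cut-down copy of the reprex.

```r
library(dplyr)
library(tidyr)

# Hypothetical wrapper; the function name is an assumption for illustration.
clean_survey_headers <- function(df) {
  keydat <- df %>%
    slice(1) %>%
    select(where(~ !is.na(.x))) %>%                # keep columns with a label
    mutate(across(everything(), as.character)) %>% # one common type for pivoting
    pivot_longer(everything()) %>%
    # a new group starts at every real header
    group_by(grp = cumsum(!startsWith(name, "..."))) %>%
    mutate(value = sprintf("%s (%s)", first(name), value)) %>%
    ungroup()

  df %>%
    # rename() expects new_name = old_name, so the new names go in names()
    rename(all_of(setNames(keydat$name, keydat$value))) %>%
    slice(-1)
}

df <- tibble::tribble(
  ~`Which of these choices do you like`, ~`...1`, ~`Respondent ID`, ~`Different Text`, ~`...2`,
  "Fruit", "Drink", NA,   "Pizza Topping", "Pizza Style",
  "Apple", "Water", 1234, "Mushroom",      "Deep Dish",
  "Pear",  "Vodka", 1237, "Onion",         "NY Style"
)

out <- clean_survey_headers(df)
names(out)
```

One caveat carried over from the grouping logic: a "..." column that directly follows a header whose first row is NA would inherit the previous surviving header, so it is worth checking the result on exports laid out that way.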