Recode observation in column depending on different column
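I have a dataset called 'survey' with individual IDs as rows and many questions as columns. I need to recode the values in one column to NA and move the observations to another column.
For example: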
ID Food Vegetable
aaa NA NA
bbb NA lemon
ccc NA sprout
ddd fruit NA
eee fruit NA
fff NA watermelon
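I want to take the lemon and watermelon observations belonging to IDs bbb and fff, put them into the Food column and rename them fruit (the survey respondents put them in the wrong column), and leave NA in the Vegetable column.
So it should look like: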
ID Food Vegetable
aaa NA NA
bbb fruit NA
ccc NA sprout
ddd fruit NA
eee fruit NA
fff fruit NA
I used:
survey <- survey %>%
  mutate(food = if_else(str_detect(Vegetable, "(lemon)|(watermelon)"), "fruit", Food))
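This converts the NA to fruit in the food column, but it does not put a matching NA in the Vegetable column, and it also turns all the other fruit values in the food column into NA!
Data: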
structure(list(ID = c("aaa", "bbb", "ccc", "ddd", "eee", "fff"
), Food = c(NA, NA, NA, "fruit", "fruit", NA), Vegetable = c(NA,
"lemon", "sprout", NA, NA, "watermelon")), class = "data.frame", row.names = c(NA,
-6L))
P.S.: This is a follow-up to a previous question. It is not exactly the same as that question, which is why I started a new one.
dplyr version (1.0.2)
One option is to update both Food and Vegetable depending on whether the Vegetable value is %in% a given list, not_vegetables:
not_vegetables <- c("grape", "tomato")
df %>%
  mutate(Food = if_else(Vegetable %in% not_vegetables, "fruit", Food),
         Vegetable = if_else(Vegetable %in% not_vegetables, NA_character_, Vegetable))
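As a rough sketch, here is the same idea applied to the survey data from the question rather than to the grape/tomato example. The names misplaced and fixed are made up for illustration. It also shows the likely reason the original attempt produced extra NA values: str_detect() returns NA when its input is NA, and if_else() passes that NA through, whereas %in% returns FALSE for NA.

```r
library(dplyr)
library(stringr)

# Data from the question
survey <- data.frame(
  ID = c("aaa", "bbb", "ccc", "ddd", "eee", "fff"),
  Food = c(NA, NA, NA, "fruit", "fruit", NA),
  Vegetable = c(NA, "lemon", "sprout", NA, NA, "watermelon"),
  stringsAsFactors = FALSE
)

# Hypothetical name for the values that were entered in the wrong column
misplaced <- c("lemon", "watermelon")

# str_detect() gives NA for missing input, so if_else() would return NA there
str_detect(survey$Vegetable, "lemon|watermelon")

# %in% gives FALSE for missing input, so those rows keep their existing value
survey$Vegetable %in% misplaced

# Update Food first, then blank out the misplaced Vegetable entries
fixed <- survey %>%
  mutate(
    Food = if_else(Vegetable %in% misplaced, "fruit", Food),
    Vegetable = if_else(Vegetable %in% misplaced, NA_character_, Vegetable)
  )

fixed
```

Note that the column is written as Food, not food; with a lowercase name, mutate() would add a new column instead of updating the existing one.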
Another approach is to replace across both columns, with an if_else inside:
df %>%
  mutate(across(
    c(Food, Vegetable),
    ~replace(.,
             Vegetable %in% not_vegetables,
             if_else(cur_column() == "Food", 'fruit', NA_character_))
  ))
With base R, you could try this:
#Conditional
values <- c('grape','tomato')
df$Food <- ifelse(df$Vegetable %in% values,'fruit',df$Food)
df$Vegetable <- ifelse(df$Vegetable %in% values,NA,df$Vegetable)
Output:
df
ID Food Vegetable
1 aaa fruit <NA>
2 bbb fruit <NA>
3 ccc fruit <NA>
4 ddd fruit <NA>
Data:
df <- structure(list(ID = c("aaa", "bbb", "ccc", "ddd"), Food = c(NA,
NA, "fruit", "fruit"), Vegetable = c("grape", "tomato", NA, NA
)), class = "data.frame", row.names = c(NA, -4L))
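For comparison, a minimal base R sketch on the question's own survey data, using a logical index instead of ifelse(). The name misplaced is hypothetical, and this is only one way to write it.

```r
# Data from the question
survey <- data.frame(
  ID = c("aaa", "bbb", "ccc", "ddd", "eee", "fff"),
  Food = c(NA, NA, NA, "fruit", "fruit", NA),
  Vegetable = c(NA, "lemon", "sprout", NA, NA, "watermelon"),
  stringsAsFactors = FALSE
)

# TRUE for rows whose Vegetable entry is really a fruit; NA entries give FALSE
misplaced <- survey$Vegetable %in% c("lemon", "watermelon")

# Move the observation to Food, then clear it from Vegetable
survey$Food[misplaced] <- "fruit"
survey$Vegetable[misplaced] <- NA

survey
```

Computing the index once before either assignment matters: if Vegetable were cleared first, the second test would no longer find the rows to recode.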