What is the most efficient way to correct text-type data in a column?
fito <- c("forest", "savaaaana", "brae soil", "bare soil", "savanna", "froest")
id <- 1:6
df <- data.frame(fito = as.factor(fito), id = id)
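What is the smartest way to replace the mistyped values ("savaaaana", "brae soil", "froest")
with the correct ones ("savanna", "bare soil", "forest")?
Initially the factor has six levels, but only three of them are correct.
How can I do this with the tidyverse packages?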
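You can try:
library(tidyverse)  # provides %>%, mutate() and fct_collapse()
# Merge each misspelling into its correct level
df2 <- df %>% mutate(fito = fct_collapse(fito,
  savanna = c("savaaaana", "savanna"),
  `bare soil` = c("brae soil", "bare soil"),
  forest = c("forest", "froest")))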
str(df2)
'data.frame': 6 obs. of 2 variables:
$ fito: Factor w/ 3 levels "bare soil","forest",..: 2 3 1 1 3 2
$ id : int 1 2 3 4 5 6
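Before collapsing, it can help to list which levels are actually wrong. A minimal sketch, assuming the three valid categories from the question are known up front (the names `valid` and `bad_levels` are mine, not from the original post):

```r
library(forcats)

fito <- c("forest", "savaaaana", "brae soil", "bare soil", "savanna", "froest")
df <- data.frame(fito = as.factor(fito), id = 1:6)

# Assumed set of valid categories, taken from the question text
valid <- c("savanna", "bare soil", "forest")

# Levels present in the data that are not in the valid set
bad_levels <- setdiff(levels(df$fito), valid)
bad_levels

# Count of each level, handy for spotting rare misspellings
fct_count(df$fito)
```

Whatever shows up in `bad_levels` is what needs to go into the `fct_collapse()` call.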
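Two approaches: replace the strings with str_replace_all(), or recode the factor levels with fct_recode() (shown in two variants):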
library(tidyverse)
old <- c("savaaaana", "brae soil", "froest")
new <- c("savanna", "bare soil", "forest")
# Approach 1: string replacement with a named vector (names = patterns, values = replacements)
df %>%
  mutate(fito = factor(str_replace_all(fito, set_names(new, old))))
       fito id
1    forest  1
2   savanna  2
3 bare soil  3
4 bare soil  4
5   savanna  5
6    forest  6
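One caveat with str_replace_all(): the names of the vector are treated as regular expressions and match anywhere inside a string, so a short pattern could also change longer values that merely contain it. If that is a concern, here is a sketch of an exact-match variant, assuming the same `old`/`new` vectors (the name `df_exact` is only for illustration):

```r
library(dplyr)

fito <- c("forest", "savaaaana", "brae soil", "bare soil", "savanna", "froest")
df <- data.frame(fito = as.factor(fito), id = 1:6)

old <- c("savaaaana", "brae soil", "froest")
new <- c("savanna", "bare soil", "forest")

df_exact <- df %>%
  mutate(
    # Work on plain strings so the comparison is an exact, whole-value match
    fito = as.character(fito),
    # Replace only values that are exactly equal to an entry of `old`
    fito = factor(if_else(fito %in% old, new[match(fito, old)], fito))
  )

levels(df_exact$fito)
```

Values that are not in `old` pass through unchanged.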
# Approach 2: recode the factor levels; fct_recode() expects new = "old" pairs
df %>%
  mutate(fito = lift(fct_recode)(as.list(set_names(old, new)), fito))
       fito id
1    forest  1
2   savanna  2
3 bare soil  3
4 bare soil  4
5   savanna  5
6    forest  6
# Same recoding, with the arguments passed to fct_recode() through invoke()
df %>%
  mutate(fito = invoke(fct_recode, c(list(fito), as.list(set_names(old, new)))))
       fito id
1    forest  1
2   savanna  2
3 bare soil  3
4 bare soil  4
5   savanna  5
6    forest  6
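As far as I know, lift() and invoke() are deprecated in recent purrr releases (1.0.0 and later), so they may emit warnings. A sketch of the same recoding without them, splicing the named vector into fct_recode() with `!!!` (the name `df_recode` is only for illustration; base `setNames()` is used here instead of `set_names()`):

```r
library(dplyr)
library(forcats)

fito <- c("forest", "savaaaana", "brae soil", "bare soil", "savanna", "froest")
df <- data.frame(fito = as.factor(fito), id = 1:6)

old <- c("savaaaana", "brae soil", "froest")
new <- c("savanna", "bare soil", "forest")

# fct_recode() takes new = "old" pairs; !!! splices the named vector into them
df_recode <- df %>%
  mutate(fito = fct_recode(fito, !!!setNames(old, new)))

levels(df_recode$fito)
```

This keeps `fito` as a factor throughout, so there is no round trip through character strings.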