How to recode an ordinal variable?
I am working with survey data from the World Values Survey, and I used the code below to convert my variable from numeric to ordered:
renameddata$Education <- ordered(renameddata$Education,
  levels = c(-2, -1, 840001, 840002, 840003,
             840004, 840005, 840006, 840007,
             840008, 840009),
  labels = c("NA", "NA", "LessHighSchool", "SomeHighSchool",
             "GED", "SomeCollege", "Associates", "Bachelors",
             "Masters", "Professional", "Doctorate"))
However, I now want to recode the education variable so that LessHighSchool and SomeHighSchool are merged into a single level, e.g. "NO GED", and SomeCollege, Associates, and Bachelors become "Undergraduate", and so on.
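How about this (it works on the original numeric codes, i.e. before the conversion to an ordered factor):
library(dplyr)

# Codes that are not listed below (e.g. -2 and -1) become NA.
renameddata <- renameddata %>%
  mutate(
    Education = case_when(
      Education %in% c(840001, 840002) ~ "No GED",
      Education == 840003 ~ "GED",
      Education %in% c(840004, 840005, 840006) ~ "Undergraduate",
      Education %in% c(840007, 840008, 840009) ~ "Graduate",
      TRUE ~ NA_character_
    ),
    # Set the level order explicitly and keep the variable ordinal.
    Education = factor(Education,
                       levels = c("No GED", "GED", "Undergraduate", "Graduate"),
                       ordered = TRUE)
  )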
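If dplyr is not available, the same mapping can be sketched in base R with a named lookup vector. This is only an illustration under assumptions: the names `codes`, `lookup`, `recoded`, and `education` are made up for the example, and the codes are assumed to still be numeric.

```r
# Hypothetical example data: the numeric codes from the question.
codes <- c(-2, -1, 840001:840009)

# Hypothetical lookup table: names are the codes as strings,
# values are the collapsed labels.
lookup <- c(
  "840001" = "No GED", "840002" = "No GED",
  "840003" = "GED",
  "840004" = "Undergraduate", "840005" = "Undergraduate",
  "840006" = "Undergraduate",
  "840007" = "Graduate", "840008" = "Graduate", "840009" = "Graduate"
)

# Codes that are missing from the lookup (-2 and -1 here) come back as NA.
recoded <- unname(lookup[as.character(codes)])

# Build an ordered factor with an explicit level order.
education <- factor(recoded,
                    levels = c("No GED", "GED", "Undergraduate", "Graduate"),
                    ordered = TRUE)

# Cross-tabulate old codes against new levels to check the mapping.
print(table(codes, education, useNA = "ifany"))
```

The cross-tabulation at the end is a quick way to confirm that every code landed in the intended group.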
Alternatively, if you want to recode the factor variable you have already created, you can use fct_collapse from the forcats package:
Input:
renameddata <- data.frame(Education = c(-2, -1, 840001, 840002, 840003, 840004, 840005, 840006, 840007, 840008, 840009))
renameddata$Education = ordered(renameddata$Education,
levels = c(-2, -1, 840001, 840002, 840003, 840004, 840005, 840006, 840007, 840008, 840009),
labels = c("NA", "NA", "LessHighSchool", "SomeHighSchool", "GED", "SomeCollege", "Associates", "Bachelors", "Masters", "Professional", "Doctorate"))
Recoding:
library(forcats)
renameddata$Education <- fct_collapse(renameddata$Education,
"NO GED" = c("LessHighSchool", "SomeHighSchool"),
"Undergraduate" = c("SomeCollege", "Associates", "Bachelors"))
Gives:
       Education
1             NA
2             NA
3         NO GED
4         NO GED
5            GED
6  Undergraduate
7  Undergraduate
8  Undergraduate
9        Masters
10  Professional
11     Doctorate
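The "and so on" in the question presumably also covers the graduate degrees. As a hedged sketch, base R can merge levels of an existing factor by assigning a named list to `levels()`, with no extra package. The variable names `codes` and `education` and the "Graduate" label are assumptions for illustration. Unlike the input above, the missing-value codes -2 and -1 are left out of `levels` here, so they become real `NA` values rather than a level labelled "NA".

```r
# Hypothetical example data: the numeric codes from the question.
codes <- c(-2, -1, 840001:840009)

# Assumption: -2 and -1 are missing-value codes. They are not listed in
# `levels`, so they turn into true NA values.
education <- ordered(codes,
  levels = 840001:840009,
  labels = c("LessHighSchool", "SomeHighSchool", "GED",
             "SomeCollege", "Associates", "Bachelors",
             "Masters", "Professional", "Doctorate"))

# Assigning a named list merges levels: each name is a new level and its
# value lists the old levels it absorbs. Every old level is listed here,
# and the order of the names sets the new level order.
levels(education) <- list(
  "NO GED"        = c("LessHighSchool", "SomeHighSchool"),
  "GED"           = "GED",
  "Undergraduate" = c("SomeCollege", "Associates", "Bachelors"),
  "Graduate"      = c("Masters", "Professional", "Doctorate")
)

print(education)
```

Because the factor keeps its `ordered` class, comparisons such as `education < "Undergraduate"` still work after the merge.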