How can you order factor levels all at once instead of separately?
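I'm analyzing a survey in which most questions (105 of 167) are rankings from 1 to 10, with 99999 recorded when a question was left blank. I loaded the data set into R and made a data frame from those 105 questions. When I did this, I found the data types were not what I wanted: every column was a double. So I first changed the data types (data set = survey):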
survey <-data.frame(lapply(survey, as.character), stringsAsFactors=FALSE)
survey[survey == 99999] <- "No answer"
This let me change 99999 to "No answer". Then I used:
survey[] <- lapply(survey,factor)
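to convert the columns to factors. The problem is that once I had converted the columns to character, the order of the factor levels changed. I think the reason is that for some questions nobody chose rank 1, and when you convert to character, rank 10 gets put first, for example: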
survey %>% group_by(v2_a)%>% summarize(count = n())
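For illustration, the reordering can be reproduced on a small vector. This is only a sketch with made-up answers (the `answers` vector is hypothetical, not the real survey data). It suggests the cause is that `factor()` sorts character labels as text when no `levels` are given, so "10" lands before "2" whether or not anyone chose rank 1:

```r
# Hypothetical answers to one question in which nobody chose rank 1.
answers <- c(2, 10, 3, 99999, 10)

# After conversion to character, factor() sorts the labels as text.
as_text <- as.character(answers)
as_text[as_text == "99999"] <- "No answer"
default_levels <- levels(factor(as_text))
default_levels

# Supplying levels explicitly fixes the order and keeps unused ranks.
explicit_levels <- levels(factor(as_text, levels = c(1:10, "No answer")))
explicit_levels
```

With the default, the levels come out in text order; with explicit `levels`, they follow the order given, and ranks nobody chose still appear as empty levels.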
I know a way to reorder the levels one column at a time, for example:
survey$v2_a <- factor(survey$v2_a, levels = c("1","2", "3", "4","5","6","7","8","9","10","No answer"))
survey$v2_b <- factor(survey$v2_b, levels = c("1","2", "3", "4","5","6","7","8","9","10","No answer"))
survey$v2_c <- factor(survey$v2_c, levels = c("1","2", "3", "4","5","6","7","8","9","10","No answer"))
...
But doing this for 105 different questions is a lot of work. Does anyone know a shorter way? I tried something like:
survey <- factor(survey, levels = c("1","2", "3", "4","5","6","7","8","9","10","No answer"))
but that certainly doesn't work.
Any additional arguments supplied to lapply are passed on to the function being applied, so something like
survey[] <- lapply(survey, factor, levels = c(1:10, "No answer"))
might work.
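As a rough check of the idea, here is a minimal sketch on a toy data frame. The columns and values are made up for illustration, and `rank_levels` is a helper name introduced here. Note that the label in `levels` has to match the recoded value exactly, including capitalisation, because any value not found in `levels` becomes NA:

```r
# Toy stand-in for the survey data (hypothetical columns and values).
survey <- data.frame(
  v2_a = c("2", "10", "No answer", "3"),
  v2_b = c("10", "9", "1", "No answer"),
  stringsAsFactors = FALSE
)

# c(1:10, "No answer") is coerced to character: "1", ..., "10", "No answer".
rank_levels <- c(1:10, "No answer")

# lapply() forwards levels = rank_levels to factor() for every column;
# assigning into survey[] keeps the data frame structure.
survey[] <- lapply(survey, factor, levels = rank_levels)

levels(survey$v2_a)
table(survey$v2_b)
```

Every column then shares the same eleven levels in the same order, so grouped counts and plots line up across questions.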
If you want to be more explicit about it, you could do:
ffun <- function(x) factor(x, levels = c(1:10, "No answer"))
survey[] <- lapply(survey, ffun)
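You could also try reading your data with na.strings="99999" (or whatever the missing-value code is) in the first place, so that your no-answer cases are automatically converted to NA.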
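A minimal sketch of that alternative, assuming the data come from a CSV file. The inline `csv_text` below is a made-up stand-in for the real file, and `read.csv` is only one of the readers that accept `na.strings`:

```r
# Inline CSV standing in for the real survey file (hypothetical data).
csv_text <- "v2_a,v2_b
2,10
99999,9
10,99999
"

# na.strings turns every field equal to "99999" into NA while reading.
survey <- read.csv(text = csv_text, na.strings = "99999")

# The columns are read as integers, so the levels can simply be 1:10.
survey[] <- lapply(survey, factor, levels = 1:10)

# useNA = "ifany" reports the missing answers alongside the ranks.
table(survey$v2_a, useNA = "ifany")
```

With this approach the missing answers stay as NA rather than becoming a "No answer" level, so they are dropped by default from most summaries unless you ask for them.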