Reorder the levels of multiple categorical variables using a vector of variable names

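I have a large dataset containing factor variables, but I only want to reorder the levels for a list of variables, called "myvars" below. I want to reorder the levels so that they are summarized in a meaningful order in a Table 1. However, when I try to change the level order for the whole vector of variables in the dataset at once, I keep getting the error: Error: Assigned data `value` must be compatible with existing data...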

Example data:

donuts <- c("moderately","a lot","a lot","a lot","a little bit")
cookies <- c("a lot","a lot","not at all","moderately","a lot")
cupcakes <- c("not at all","not at all","a lot","moderately","a little bit")
coffee <- c("a little bit","not at all","moderately","a little bit","not at all")
macarons <- c("a little bit","moderately","not at all","not at all","a little bit")
dataset <- data.frame(donuts,cookies,cupcakes,coffee,macarons)
myvars <- c("donuts","cookies","cupcakes")

dataset[,myvars] <- factor(dataset[,myvars],levels=c("Not at all","Moderately","A little bit","A lot"))

Or should I use a loop? Any suggestions are greatly appreciated, thank you!

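Use lapply to change the factor levels in multiple columns. Also make sure the factor levels are spelled exactly as they appear in the data, otherwise factor() will return NA for the values that do not match. In your attempt you used mixed case, whereas the data is all lowercase.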

dataset[, myvars] <- lapply(dataset[, myvars], factor, 
                      levels=c("not at all","moderately","a little bit","a lot"))
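As a quick check, here is a minimal sketch that rebuilds part of the example data, shows what mixed-case levels do, applies the lapply() call, and, since the question asks about a loop, runs an equivalent for loop. The names `lvls`, `bad`, and `looped` are illustrative and not part of the original post.

```r
# Rebuild the example data (only the columns needed here)
donuts   <- c("moderately", "a lot", "a lot", "a lot", "a little bit")
cookies  <- c("a lot", "a lot", "not at all", "moderately", "a lot")
cupcakes <- c("not at all", "not at all", "a lot", "moderately", "a little bit")
coffee   <- c("a little bit", "not at all", "moderately", "a little bit", "not at all")
dataset  <- data.frame(donuts, cookies, cupcakes, coffee)
myvars   <- c("donuts", "cookies", "cupcakes")

# Illustrative name: the desired level order, spelled exactly as in the data
lvls <- c("not at all", "moderately", "a little bit", "a lot")

# Mixed-case levels match nothing in the data, so every value becomes NA
bad <- factor(dataset$donuts,
              levels = c("Not at all", "Moderately", "A little bit", "A lot"))
all(is.na(bad))

# lapply() converts each selected column separately
dataset[myvars] <- lapply(dataset[myvars], factor, levels = lvls)
levels(dataset$donuts)
table(dataset$donuts)

# Equivalent explicit loop over the same column names
looped <- data.frame(donuts, cookies, cupcakes, coffee)
for (v in myvars) {
  looped[[v]] <- factor(looped[[v]], levels = lvls)
}
all.equal(looped[myvars], dataset[myvars])
```

The loop and lapply() do the same work here; lapply() is simply the shorter way to write it.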

Using dplyr:

library(dplyr)
dataset %>%
  mutate(across(all_of(myvars), ~ factor(.x,
               levels=c("not at all","moderately","a little bit","a lot"))))
# In older versions of dplyr (before 1.0.0), use mutate_at instead:
# dataset %>%
#   mutate_at(vars(myvars), factor,
#             levels=c("not at all","moderately","a little bit","a lot"))
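Note that a dplyr pipeline returns a modified copy rather than changing dataset in place, so the result has to be assigned to keep it. A minimal sketch, assuming dplyr 1.0.0 or later (for across() and all_of()); `lvls` and `releveled` are illustrative names, not part of the original post:

```r
library(dplyr)

donuts   <- c("moderately", "a lot", "a lot", "a lot", "a little bit")
cookies  <- c("a lot", "a lot", "not at all", "moderately", "a lot")
cupcakes <- c("not at all", "not at all", "a lot", "moderately", "a little bit")
coffee   <- c("a little bit", "not at all", "moderately", "a little bit", "not at all")
dataset  <- data.frame(donuts, cookies, cupcakes, coffee,
                       stringsAsFactors = FALSE)
myvars   <- c("donuts", "cookies", "cupcakes")
lvls     <- c("not at all", "moderately", "a little bit", "a lot")

# all_of() makes explicit that myvars is an external character vector
# of column names
releveled <- dataset %>%
  mutate(across(all_of(myvars), ~ factor(.x, levels = lvls)))

levels(releveled$cupcakes)    # the new level order
is.factor(dataset$cupcakes)   # the original data frame is untouched
is.factor(releveled$coffee)   # columns outside myvars are left alone
```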

Data:

dataset <- data.frame(donuts,cookies,cupcakes,coffee,macarons)

You are missing the unlist function, i.e.:

# levels must match the lowercase values in the data, otherwise every entry becomes NA
dataset[,myvars] <- factor(unlist(dataset[,myvars]),
                levels=c("not at all","moderately","a little bit","a lot"))
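To see why unlist() matters here: factor() expects an atomic vector, while a multi-column data frame is a list of columns. The sketch below compares the two calls and then inspects the column classes after the assignment. That last check is worth running, because one long vector assigned across several columns is not guaranteed to keep its factor class. The names `lvls`, `without_unlist`, and `with_unlist` are illustrative.

```r
donuts   <- c("moderately", "a lot", "a lot", "a lot", "a little bit")
cookies  <- c("a lot", "a lot", "not at all", "moderately", "a lot")
cupcakes <- c("not at all", "not at all", "a lot", "moderately", "a little bit")
dataset  <- data.frame(donuts, cookies, cupcakes, stringsAsFactors = FALSE)
myvars   <- c("donuts", "cookies", "cupcakes")
lvls     <- c("not at all", "moderately", "a little bit", "a lot")

# Without unlist(): factor() sees a list with one element per column,
# so the result has one entry per column and none of them match a level
without_unlist <- factor(dataset[, myvars], levels = lvls)
length(without_unlist)
all(is.na(without_unlist))

# With unlist(): the columns are flattened into one vector of values
with_unlist <- factor(unlist(dataset[, myvars]), levels = lvls)
length(with_unlist)
table(with_unlist)

# Assign back, then check which class each column ended up with
dataset[, myvars] <- with_unlist
sapply(dataset[myvars], class)
```

If the columns do not come back as factors, converting column by column (for example with lapply) is the safer route.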