Source variable for the categorisation does not exist

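I am trying to create a factor variable from a newly created variable that counts items across several columns. I am stuck at the final stage: when I try to categorise the new variable, R tells me that the source variable for the categorisation does not exist, even though it does. Can anyone help me solve this? The source code and the error are shared below. Many thanks.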

# download pacman package if not installed, otherwise load it
if (!require(pacman)) install.packages("pacman")

# loads relevant packages using the pacman package
pacman::p_load(
  tidyverse,  
  data.table) 


# Create dataset
data <- data.table(A = sample(as.numeric(c(0, 1, 2)), 1300,replace = TRUE),
                   B = sample(as.numeric(c(0, 0, 2)), 1300,replace = TRUE),
                   C = sample(as.numeric(c(0, 1, 2)), 1300,replace = TRUE),
                   D = sample(as.numeric(c(0, 0, 1)), 1300,replace = TRUE),
                   E = sample(as.numeric(c(0, 1, 1)), 1300,replace = TRUE))  


# sum the relevant columns (B to E) within each row and create new sumVar column
data <- data %>% 
  rowwise() %>% 
  mutate(sumVar = sum(c_across((B:E)))) 

# categorise the newly created sum variable
data <- data %>% 
  mutate(sum = 9) %>% 
  .[sumVar == 0, sum := 1] %>% 
  .[sumVar == 1, sum := 2] %>% 
  .[sumVar == 2, sum := 3] %>% 
  .[sumVar >= 3, sum := 4] %>% 
  .[sum == 9, sum := NA] 

The error produced by the code above:

Error in `[.tbl_df`(., sumVar == 0, `:=`(sum, 1)) : 
  object 'sumVar' not found

After you use rowwise, your data is no longer a data.table but a tibble, so data.table syntax will not work on it. Try:

library(data.table)

setDT(data)

data %>%
  .[sumVar == 0, sum := 1] %>% 
  .[sumVar == 1, sum := 2] %>% 
  .[sumVar == 2, sum := 3] %>% 
  .[sumVar >= 3, sum := 4] %>% 
  .[sum == 9, sum := NA] 

data

#      A B C D E sumVar sum
#   1: 2 0 1 1 0      2   3
#   2: 1 2 2 0 0      4   4
#   3: 1 0 2 1 1      4   4
#   4: 0 0 0 0 1      1   2
#   5: 1 2 0 1 1      4   4
#  ---                     
#1296: 1 2 1 0 1      4   4
#1297: 2 0 0 0 0      0   1
#1298: 2 0 2 0 1      3   4
#1299: 0 2 1 0 0      3   4
#1300: 2 2 0 0 1      3   4
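
If you would rather not leave data.table at all, here is a minimal sketch of one possible alternative. It first checks that rowwise() really does drop the data.table class, then computes the row sums and the categories with data.table only. It assumes data.table 1.13.0 or later (for fcase()); the object name dt, the seed, the 10-row size and the helper variable is_dt_after_rowwise are illustrative choices, not part of the original code.

```r
library(data.table)
library(dplyr)

set.seed(1)  # assumed seed, only to make the sketch reproducible

# Small illustrative dataset with the same columns as in the question
dt <- data.table(A = sample(c(0, 1, 2), 10, replace = TRUE),
                 B = sample(c(0, 0, 2), 10, replace = TRUE),
                 C = sample(c(0, 1, 2), 10, replace = TRUE),
                 D = sample(c(0, 0, 1), 10, replace = TRUE),
                 E = sample(c(0, 1, 1), 10, replace = TRUE))

# rowwise() returns a tibble (rowwise_df), so the data.table class is lost;
# this is why the `[` call in the question dispatched to `[.tbl_df`
is_dt_after_rowwise <- is.data.table(rowwise(dt))  # FALSE

# Row-wise sum of B to E without leaving data.table
dt[, sumVar := rowSums(.SD), .SDcols = c("B", "C", "D", "E")]

# Categorise in one step; rows matching no condition get NA by default
dt[, sum := fcase(sumVar == 0, 1,
                  sumVar == 1, 2,
                  sumVar == 2, 3,
                  sumVar >= 3, 4)]
```

Because fcase() returns NA for any row that matches none of the conditions, the placeholder value 9 and the final recoding step are not needed in this variant.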