Source variable for the categorisation does not exist
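I am trying to create a factor variable from a newly created variable that counts the number of items across multiple columns. I am stuck at the final step: when I try to categorise the new variable, R tells me that the source variable for the categorisation does not exist, even though it does. Can anyone help me resolve this? The source code and the error message are shared below. Many thanks.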
# download pacman package if not installed, otherwise load it
if (!require(pacman)) install.packages("pacman")
# loads relevant packages using the pacman package
pacman::p_load(
tidyverse,
data.table)
# Create dataset
data <- data.table(A = sample(as.numeric(c(0, 1, 2)), 1300,replace = TRUE),
B = sample(as.numeric(c(0, 0, 2)), 1300,replace = TRUE),
C = sample(as.numeric(c(0, 1, 2)), 1300,replace = TRUE),
D = sample(as.numeric(c(0, 0, 1)), 1300,replace = TRUE),
E = sample(as.numeric(c(0, 1, 1)), 1300,replace = TRUE))
# sum columns B to E within each row and store the result in a new sumVar column
data <- data %>%
rowwise() %>%
mutate(sumVar = sum(c_across((B:E))))
# categorise the newly created sum variable
data <- data %>%
mutate(sum = 9) %>%
.[sumVar == 0, sum := 1] %>%
.[sumVar == 1, sum := 2] %>%
.[sumVar == 2, sum := 3] %>%
.[sumVar >= 3, sum := 4] %>%
.[sum == 9, sum := NA]
The error produced by the code above:
Error in `[.tbl_df`(., sumVar == 0, `:=`(sum, 1)) :
object 'sumVar' not found
After using rowwise, your data is no longer a data.table but a tibble, so data.table syntax will not work on it. Try:
library(data.table)
setDT(data)
data %>%
.[sumVar == 0, sum := 1] %>%
.[sumVar == 1, sum := 2] %>%
.[sumVar == 2, sum := 3] %>%
.[sumVar >= 3, sum := 4] %>%
.[sum == 9, sum := NA]
data
# A B C D E sumVar sum
# 1: 2 0 1 1 0 2 3
# 2: 1 2 2 0 0 4 4
# 3: 1 0 2 1 1 4 4
# 4: 0 0 0 0 1 1 2
# 5: 1 2 0 1 1 4 4
# ---
#1296: 1 2 1 0 1 4 4
#1297: 2 0 0 0 0 0 1
#1298: 2 0 2 0 1 3 4
#1299: 0 2 1 0 0 3 4
#1300: 2 2 0 0 1 3 4
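As a complement to the fix above, here is a minimal sketch of one way to check the class change yourself and to avoid rowwise altogether by staying in data.table. It is not the asker's original method. The object names dt and rw, the reduced row count of 20, and the fixed seed are assumptions made for illustration, and fcase requires data.table 1.13.0 or later.

```r
library(data.table)
library(dplyr)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible

# Smaller stand-in for the dataset in the question (20 rows instead of 1300)
dt <- data.table(
  A = sample(c(0, 1, 2), 20, replace = TRUE),
  B = sample(c(0, 0, 2), 20, replace = TRUE),
  C = sample(c(0, 1, 2), 20, replace = TRUE),
  D = sample(c(0, 0, 1), 20, replace = TRUE),
  E = sample(c(0, 1, 1), 20, replace = TRUE)
)

# rowwise() returns a rowwise tibble, so the data.table class is lost
rw <- dt %>%
  rowwise() %>%
  mutate(sumVar = sum(c_across(B:E)))
class(rw)
inherits(rw, "data.table")

# data.table-only alternative: row sums over B to E, no rowwise() needed
dt[, sumVar := rowSums(.SD), .SDcols = c("B", "C", "D", "E")]

# Categorise in one step; rows matching no condition would get NA by default
dt[, sum := fcase(
  sumVar == 0, 1,
  sumVar == 1, 2,
  sumVar == 2, 3,
  sumVar >= 3, 4
)]
```

Because dt never leaves data.table here, there is no need for setDT, and the placeholder value 9 is unnecessary since fcase leaves unmatched rows as NA.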