While using the onehot library in R, I get an error in the model.matrix command
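For label encoding, I am using model.matrix after loading the onehot library in R.
The dataset is available here.
I have renamed the file to train.csv.
The feature to encode is Education. It has two levels, Graduate and Not Graduate. But on executing the code,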
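library(onehot)
library(readr)   # provides read_csv()
library(dplyr)   # provides %>%, select() and starts_with()
data <- read_csv("train.csv")
set.seed(1234)
# Shuffle the rows
datashuffled <- data[sample(nrow(data)), ]
# Drop the target column(s)
datashuffled_Loan_StatusRemoved <- datashuffled %>%
  select(-starts_with("Loan_Status"))
features <- datashuffled_Loan_StatusRemoved
# Count missing values in Education
sum(is.na(features$Education))
# Remove the space from the level name
features$Education[features$Education == "Not Graduate"] <- "NotGraduate"
E <- model.matrix(~ Education - 1, head(features))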
I get an error:
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
  contrasts can be applied only to factors with 2 or more levels
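The error can be reproduced without the loan dataset. The following is a minimal sketch, assuming a toy data frame (the name toy and its values are made up for illustration) whose first six rows all share one Education value:

```r
# Toy data (hypothetical, not the loan dataset): the first six rows
# all have the same Education value.
toy <- data.frame(
  Education = c(rep("Graduate", 6), "NotGraduate", "NotGraduate"),
  stringsAsFactors = FALSE
)

# head() keeps the first six rows, so only one distinct value remains.
n_levels_head <- length(unique(head(toy)$Education))

# model.matrix() coerces the character column to a factor; with a single
# level it fails while setting contrasts. Capture the message instead of
# stopping the script.
msg <- tryCatch(
  {
    model.matrix(~ Education - 1, head(toy))
    NA_character_
  },
  error = function(e) conditionMessage(e)
)

print(n_levels_head)
print(msg)
```

Running the same check, length(unique(head(features)$Education)), on the real data should show whether the first six rows contain only one level.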
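Sorry, that was my mistake. I should have passed the complete dataset to model.matrix. head(features) keeps only the first six rows, and if those rows all have the same Education value, the variable ends up with a single level, which triggers the error. The fix is to replace
E <- model.matrix(~Education-1,head(features))
with
E <- model.matrix(~Education-1,features)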
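If encoding a subset is actually what you want, one option is to declare the levels up front so that a subset keeps both of them. This is only a sketch, assuming Education takes just the two values named in the question; the toy data frame is hypothetical:

```r
# Toy data (hypothetical): the first six rows share one Education value.
toy <- data.frame(
  Education = c(rep("Graduate", 6), "NotGraduate", "NotGraduate"),
  stringsAsFactors = FALSE
)

# Declare both levels explicitly. Row subsets of a factor column keep
# the full set of levels, even when only one value is present.
toy$Education <- factor(toy$Education, levels = c("Graduate", "NotGraduate"))

# Both calls now succeed; the subset simply has an all-zero column
# for the level that does not occur in it.
E_head <- model.matrix(~ Education - 1, head(toy))
E_full <- model.matrix(~ Education - 1, toy)

print(colnames(E_head))
print(dim(E_full))
```

With the levels fixed this way, the encoded columns stay the same regardless of which rows are passed in, which also helps when the same encoding is applied to separate train and test sets.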