R: Confusion matrix in RF model returns error: `data` and `reference` should be factors with the same levels
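I am new to R and want to solve a binary classification task.
The dataset has a factor variable LABELS with 2 classes: the first is 0, the second is 1. The figure below shows its actual head:
The TimeDate column is just an index.
The class distribution is computed as: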
print("the number of values with % in factor variable - LABELS:")
percentage <- prop.table(table(dataset$LABELS)) * 100
cbind(freq=table(dataset$LABELS), percentage=percentage)
Class distribution result:
I also know that the Slot2 column is calculated by the formula:
Slot2 = Var3 - Slot3 + Slot4
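The features Var1, Var2, Var3, and Var4 were selected after analyzing the correlation matrix.
Before modeling, I split the dataset into train and test parts.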
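The split code itself is not shown in the post. As a rough sketch only, a random split in base R might look like the following; the 70/30 ratio, the seed, and the synthetic stand-in for `dataset` are assumptions made for illustration, and only the column names come from the question.

```r
# Hypothetical stand-in for the real data; only the column names come from the post.
set.seed(42)
n <- 1000
dataset <- data.frame(
  Var1 = rnorm(n), Var2 = rnorm(n), Var3 = rnorm(n), Var4 = rnorm(n),
  LABELS = factor(sample(c(0, 1), n, replace = TRUE), levels = c(0, 1))
)

# Assumed 70/30 random split; the ratio actually used is not stated.
train_idx <- sample(seq_len(n), size = floor(0.7 * n))
train <- dataset[train_idx, ]
test  <- dataset[-train_idx, ]

# Row subsetting keeps the factor levels, so both parts share the levels of LABELS.
stopifnot(identical(levels(train$LABELS), levels(test$LABELS)))
```

Keeping LABELS a factor with the same levels in both parts matters later, because confusionMatrix compares the levels of the predictions with those of the reference.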
I tried to build a random forest model for the binary classification task using the following code:
rf2 <- randomForest(LABELS ~ Var1 + Var2 + Var3 + Var4,
data=train, ntree = 100,
mtry = 4, importance = TRUE)
print(rf2)
The result is:
Call:
randomForest(formula = LABELS ~ Var1 + Var2 + Var3 + Var4,
data = train, ntree = 100, mtry = 4, importance = TRUE)
Type of random forest: classification
Number of trees: 100
No. of variables tried at each split: 4
OOB estimate of error rate: 0.16%
Confusion matrix:
0 1 class.error
0 164957 341 0.002062941
1 280 233739 0.001196484
When I try to make predictions:
# Prediction & Confusion Matrix - train data
p1 <- predict(rf2, train, type="prob")
print("Prediction & Confusion Matrix - train data")
confusionMatrix(p1, train$LABELS)
# Prediction & Confusion Matrix - test data
p2 <- predict(rf2, test, type="prob")
print("Prediction & Confusion Matrix - test data")
confusionMatrix(p2, test$LABELS)
I get an error in R:
[1] "Prediction & Confusion Matrix - train data"
Error: `data` and `reference` should be factors with the same levels.
Traceback:
1. confusionMatrix(p1, train$LABELS)
2. confusionMatrix.default(p1, train$LABELS)
3. stop("`data` and `reference` should be factors with the same levels.",
. call. = FALSE)
I have also tried to fix it using ideas from the following questions:
Error in ConfusionMatrix the data and reference factors must have the same number of levels R CARET
Error in Confusion Matrix : the data and reference factors must have the same number of levels
But that did not help in my case.
Could you help me resolve this error?
I would be grateful for any ideas and comments. Thank you in advance.
The error in R:
Error: `data` and `reference` should be factors with the same levels.
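was fixed by changing the type argument of the predict function from "prob" to "response". With type="prob", predict returns a matrix of class probabilities, whereas confusionMatrix expects a factor of predicted classes. The corrected code: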
# Prediction & Confusion Matrix - train data
p1 <- predict(rf2, train, type="response")
print("Prediction & Confusion Matrix - train data")
confusionMatrix(p1, train$LABELS)
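To see why the type argument matters, here is a small sketch on synthetic data; the toy data frame, the model rf_toy, and the 0.5 cutoff are assumptions for illustration and not part of the original post. It compares what predict returns for each type, and shows one way to still use the probabilities by converting them into a factor with the same levels as the reference.

```r
library(randomForest)
library(caret)

# Hypothetical toy data with a two-level factor target, mimicking LABELS.
set.seed(1)
n <- 500
toy <- data.frame(Var1 = rnorm(n), Var2 = rnorm(n))
toy$LABELS <- factor(as.integer(toy$Var1 + toy$Var2 + rnorm(n, sd = 0.5) > 0),
                     levels = c(0, 1))

rf_toy <- randomForest(LABELS ~ Var1 + Var2, data = toy, ntree = 50)

prob <- predict(rf_toy, toy, type = "prob")      # numeric matrix, one column per class
resp <- predict(rf_toy, toy, type = "response")  # factor of predicted classes

is.matrix(prob)  # a matrix is not a factor, which is what triggers the error
is.factor(resp)

# If the probabilities are wanted (for example, to apply a custom cutoff),
# turn them into a factor with the reference levels first.
cutoff <- 0.5  # assumed threshold
pred <- factor(ifelse(prob[, "1"] > cutoff, "1", "0"),
               levels = levels(toy$LABELS))

confusionMatrix(pred, toy$LABELS)
```

The same pattern applies to the test set: either call predict with type="response", or threshold the probability column and rebuild the factor before passing it to confusionMatrix.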
@Camille, thank you very much!