ROC curve for the test set using the caret package

Question

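I am trying to get the ROC curve on the test set for the best model from caret. I came across the MLeval package, which seems handy (the output is very thorough, providing all the needed metrics and graphs with a few lines of code). A good example is here:

I am trying the code below and can get the required metrics/graphs for the training set, but I keep getting an error when I try to work on the test set.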

library(caret)
library(MLeval)
data(GermanCredit)

Train <- createDataPartition(GermanCredit$Class, p=0.6, list=FALSE)
training <- GermanCredit[ Train, ]
testing <- GermanCredit[ -Train, ]


ctrl <- trainControl(method = "repeatedcv", number = 10, classProbs = TRUE, savePredictions = TRUE,
    summaryFunction = twoClassSummary) # twoClassSummary is needed for metric = "ROC"

mod_fit <- train(Class ~ Age + ForeignWorker + Property.RealEstate + Housing.Own + 
    CreditHistory.Critical,  data=training, method="glm", family="binomial",
    trControl = ctrl, tuneLength = 5, metric = "ROC")

pred <- predict(mod_fit, newdata=testing)
confusionMatrix(data=pred, testing$Class)

test = evalm(mod_fit) # this gives the ROC curve from the cross-validation predictions saved during training, not from the test set

test1 <- evalm(pred) # I am trying this to calculate the ROC curve for the test set (I understand this should be the final curve to report), but I keep getting this error:

Error in evalm(pred) : Data frame or Caret train object required please.

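According to the package website, the first argument can be a data frame with the probabilities and the observed data. Do you know how to prepare this data frame using caret? https://www.rdocumentation.org/packages/MLeval/versions/0.1/topics/evalm

Thanks

Update:

This should be the correct script. It works well, except for displaying multiple ROC curves on one plot: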

library(caret)
library(MLeval)
data(GermanCredit)

Train <- createDataPartition(GermanCredit$Class, p=0.6, list=FALSE)
training <- GermanCredit[ Train, ]
testing <- GermanCredit[ -Train, ]


ctrl <- trainControl(method = "repeatedcv", number = 10, classProbs = TRUE, savePredictions = TRUE,
    summaryFunction = twoClassSummary) # twoClassSummary is needed for metric = "ROC"

mod_fit <- train(Class ~ Age + ForeignWorker + Property.RealEstate + Housing.Own + 
    CreditHistory.Critical,  data=training, method="glm", family="binomial",
    trControl = ctrl, tuneLength = 5, metric = "ROC")

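pred_class <- predict(mod_fit, newdata=testing) # predicted class labels, for the confusion matrix
confusionMatrix(data=pred_class, testing$Class)

pred <- predict(mod_fit, newdata=testing, type="prob") # class probabilities, needed by evalm

test = evalm(mod_fit) # ROC curve from the cross-validation predictions saved during training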
m1 = data.frame(pred, testing$Class)
 
test1 <- evalm(m1)

#Train and eval a second model: 
mod_fit2 <- train(Class ~ Age + ForeignWorker + Property.RealEstate + Housing.Own,  
data=training, method="glm", family="binomial",
    trControl = ctrl, tuneLength = 5, metric = "ROC")


pred2 <- predict(mod_fit2, newdata=testing, type="prob")
m2 = data.frame(pred2, testing$Class)

test2 <- evalm(m2)


# Display ROCs for both models in one graph: 

compare <- evalm(list(m1, m1), gnames=c('logistic1','logistic2'))

I got the last step of the code from this source: https://www.r-bloggers.com/how-to-easily-make-a-roc-curve-in-r/

But it only shows one ROC curve (it works well if I want to display the caret train outputs).
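
A likely reason only a single curve appears: the last call passes `list(m1, m1)`, so the same data frame is supplied twice and the two curves would coincide. The following is a minimal sketch, assuming the intent was to compare the two fitted models on the test set, that passes `m1` and `m2` instead. The seed value is an arbitrary addition for reproducibility and is not part of the original script.

```r
library(caret)
library(MLeval)

data(GermanCredit)
set.seed(123)  # arbitrary seed (assumption), only to make the split reproducible

Train    <- createDataPartition(GermanCredit$Class, p = 0.6, list = FALSE)
training <- GermanCredit[Train, ]
testing  <- GermanCredit[-Train, ]

ctrl <- trainControl(method = "repeatedcv", number = 10, classProbs = TRUE,
                     savePredictions = TRUE, summaryFunction = twoClassSummary)

# First model: five predictors
mod_fit <- train(Class ~ Age + ForeignWorker + Property.RealEstate +
                   Housing.Own + CreditHistory.Critical,
                 data = training, method = "glm", family = "binomial",
                 trControl = ctrl, metric = "ROC")

# Second model: same predictors without CreditHistory.Critical
mod_fit2 <- train(Class ~ Age + ForeignWorker + Property.RealEstate + Housing.Own,
                  data = training, method = "glm", family = "binomial",
                  trControl = ctrl, metric = "ROC")

# Test-set class probabilities plus the observed labels, one data frame per model
m1 <- data.frame(predict(mod_fit,  newdata = testing, type = "prob"), testing$Class)
m2 <- data.frame(predict(mod_fit2, newdata = testing, type = "prob"), testing$Class)

# Pass two *different* data frames so that two curves are drawn
compare <- evalm(list(m1, m2), gnames = c("logistic1", "logistic2"))
```

If the two curves still overlap closely after this change, that would suggest the two models really do perform similarly on the test set, rather than a plotting problem.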

Answer 1

You can use the following code:

library(MLeval)
pred <- predict(mod_fit, newdata=testing, type="prob")
test1 <- evalm(data.frame(pred, testing$Class))
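
For context on why the original `evalm(pred)` call failed: by default, `predict()` on a caret `train` object returns a factor of predicted class labels, whereas `evalm` asks for a data frame or a `train` object. Requesting `type = "prob"` returns a data frame with one probability column per class, and appending the observed labels gives the input used above. The sketch below walks through that difference end to end. The seed and the `stopifnot` checks are illustrative additions, not part of the original code.

```r
library(caret)
library(MLeval)

data(GermanCredit)
set.seed(123)  # arbitrary seed (assumption), only to make the split reproducible

Train    <- createDataPartition(GermanCredit$Class, p = 0.6, list = FALSE)
training <- GermanCredit[Train, ]
testing  <- GermanCredit[-Train, ]

ctrl <- trainControl(method = "repeatedcv", number = 10, classProbs = TRUE,
                     savePredictions = TRUE, summaryFunction = twoClassSummary)

mod_fit <- train(Class ~ Age + ForeignWorker + Property.RealEstate +
                   Housing.Own + CreditHistory.Critical,
                 data = training, method = "glm", family = "binomial",
                 trControl = ctrl, metric = "ROC")

# Default predict(): a factor of class labels, which evalm rejects
pred_class <- predict(mod_fit, newdata = testing)
stopifnot(is.factor(pred_class))

# type = "prob": a data frame with one probability column per class
pred_prob <- predict(mod_fit, newdata = testing, type = "prob")
stopifnot(is.data.frame(pred_prob),
          setequal(colnames(pred_prob), levels(testing$Class)))

# Probabilities first, observed test-set labels last, as in the answer above
m1    <- data.frame(pred_prob, testing$Class)
test1 <- evalm(m1)  # ROC curve and metrics computed on the test set
```

The class labels in `pred_class` remain useful for `confusionMatrix(data = pred_class, testing$Class)`, so it can be convenient to keep both objects rather than overwriting one with the other.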

r

roc

r-caret