Evaluate ROC metric, caret package - R

Question

I have this code:

model_nn <- train(
  Y ~ ., training,
  method = "nnet",
  metric = "ROC",
  trControl = trainControl(
    method = "cv", 
    number = 10,
    verboseIter = TRUE,
    classProbs = TRUE,
    summaryFunction = twoClassSummary
  )
)

nnprediction <- predict(model_nn, testing)
cmnn <- confusionMatrix(nnprediction, testing$Y)
print(cmnn)

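This works. However, the confusionMatrix output does not tell me how well the model performs on the ROC metric. How can I evaluate it, so that I can try different sets of variables and/or machine learning algorithms to improve the ROC performance?

PS: The dependent variable is a factor with two classes.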

Answer 1

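Simply typing model_nn gives you the AUC scores for the different settings tried during training. Here is an example, using the first 100 records of the iris data (2 classes):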

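library(caret)
library(nnet)

data(iris)
# The first 100 rows contain only setosa and versicolor
iris_reduced <- iris[1:100,]
# Drop the now-unused "virginica" level so Species has exactly 2 classes
iris_reduced <- droplevels(iris_reduced)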

model_nn <- train(
  Species ~ ., iris_reduced,
  method = "nnet",
  metric = "ROC",
  trControl = trainControl(
    method = "cv", 
    number = 5,
    verboseIter = TRUE,
    classProbs = TRUE,
    summaryFunction = twoClassSummary
  )
)

model_nn

Result:

Neural Network 

100 samples
  4 predictors
  2 classes: 'setosa', 'versicolor' 

No pre-processing
Resampling: Cross-Validated (5 fold) 
Summary of sample sizes: 80, 80, 80, 80, 80 
Resampling results across tuning parameters:

  size  decay  ROC  Sens  Spec
  1     0e+00  1.0  1.0   1   
  1     1e-04  0.8  0.8   1   
  1     1e-01  1.0  1.0   1   
  3     0e+00  1.0  1.0   1   
  3     1e-04  1.0  1.0   1   
  3     1e-01  1.0  1.0   1   
  5     0e+00  1.0  1.0   1   
  5     1e-04  1.0  1.0   1   
  5     1e-01  1.0  1.0   1   

ROC was used to select the optimal model using  the largest value.
The final values used for the model were size = 1 and decay = 0.1.
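The printed table can also be read programmatically, which helps when comparing several variable sets or algorithms. The following is a minimal sketch, assuming the same two-class iris subset as above. The seed value and `trace = FALSE` (passed through to nnet to silence its optimizer log) are additions for illustration, not part of the original answer.

```r
library(caret)

# Same two-class subset of iris as in the example above.
iris_reduced <- droplevels(iris[1:100, ])

set.seed(1)  # arbitrary seed (an assumption), only so the folds are repeatable
model_nn <- train(
  Species ~ ., iris_reduced,
  method = "nnet",
  metric = "ROC",
  trace = FALSE,  # forwarded to nnet() to suppress its iteration log
  trControl = trainControl(
    method = "cv",
    number = 5,
    classProbs = TRUE,
    summaryFunction = twoClassSummary
  )
)

# One row per tuning-parameter combination; the ROC column is the
# cross-validated AUC averaged over the folds.
cv_results <- model_nn$results
print(cv_results[, c("size", "decay", "ROC", "Sens", "Spec")])

# Resampled performance for the selected tuning values only.
best_perf <- getTrainPerf(model_nn)
print(best_perf)

# Per-fold values for the selected tuning values.
fold_perf <- model_nn$resample
print(fold_perf)
```

Comparing `best_perf$TrainROC` across models trained with the same resampling scheme is one way to decide which variable set or algorithm to keep.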

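By the way, the term "ROC" here is somewhat misleading: what is returned is of course not the ROC (which is a curve, not a number) but the area under the ROC curve, i.e. the AUC. Note that metric is an argument of train(), not of trainControl(). Passing metric = "AUC" gives the same result here, because twoClassSummary reports no column named AUC and caret then falls back, with a warning, to the first metric in the summary, which is ROC.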
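The scores above come from cross-validation on the training data. To get the same AUC figure on a held-out set, which is what the confusionMatrix step in the question is missing, one option is to call twoClassSummary directly on the test predictions. This is a sketch under stated assumptions: the 70/30 split, the seed, and the names `in_train`, `test_eval`, and `test_perf` are illustrative and do not come from the question.

```r
library(caret)

iris_reduced <- droplevels(iris[1:100, ])

set.seed(1)  # arbitrary seed (an assumption) for a repeatable split
# Hypothetical 70/30 split standing in for the asker's training/testing sets.
in_train <- createDataPartition(iris_reduced$Species, p = 0.7, list = FALSE)
training <- iris_reduced[in_train, ]
testing  <- iris_reduced[-in_train, ]

model_nn <- train(
  Species ~ ., training,
  method = "nnet",
  metric = "ROC",
  trace = FALSE,  # forwarded to nnet() to suppress its iteration log
  trControl = trainControl(
    method = "cv",
    number = 5,
    classProbs = TRUE,
    summaryFunction = twoClassSummary
  )
)

# twoClassSummary() expects the columns `obs` and `pred`, plus one
# probability column per class level, named after that level.
test_eval <- data.frame(
  obs  = testing$Species,
  pred = predict(model_nn, testing),
  predict(model_nn, testing, type = "prob")
)

# Named numeric vector with ROC (the AUC), Sens and Spec on the test set.
# The first factor level is treated as the event of interest.
test_perf <- twoClassSummary(test_eval, lev = levels(test_eval$obs))
print(test_perf)
```

Because this reuses the summary function from training, the test-set numbers are directly comparable to the ROC, Sens and Spec columns that model_nn prints.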
