How to present accuracy of different models using caret package in the same list

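I am trying to test model performance using the caret package. I get results for each model separately, but I would like a single list that contains the accuracy and ROC of all the models together. How can I do that? Here is my toy data and two models: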

dat <- read.table(text = " target birds    wolfs     snakes
        0        3        9         7
        1        3        8         4
        1        1        2         8
        0        1        2         3
        0        1        8         3
        1        6        1         2
        0        6        7         1
        1        6        1         5
        0        5        9         7
        1        3        8         7
        1        4        2         7
        0        1        2         3
        0        7        6         3
        1        6        1         1
        0        6        3         9
        1        6        1         1   ",header = TRUE)

Here are the two models:

library(caret)

svmRadial <- train(target ~ ., data = dat, method='svmRadial')
glm <- train(target ~ ., data = dat, method='glm')

I would like to get a table like this as output:

ModelName  Accuracy  ROC
svmRadial   0.95     0.74
glm         0.93     0.7

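This is essentially a question about writing a custom summaryFunction, and similar questions have been asked before. Here is a function that combines the defaultSummary and twoClassSummary functions.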

mySummary <- function(data, lev = NULL, model = NULL)
{
    # pROC supplies the ROC/AUC computation
    requireNamespace("pROC")
    if (!all(levels(data[, "pred"]) == levels(data[, "obs"]))) 
        stop("levels of observed and predicted data do not match")
    # AUC from the predicted probability of the first class level;
    # try() keeps one failed resample from aborting the whole run
    rocObject <- try(pROC::roc(data$obs, data[, lev[1]]), 
                     silent = TRUE)
    rocAUC <- if (inherits(rocObject, "try-error")) {
        NA
    } else {
        as.numeric(rocObject$auc)
    }

    if (!is.factor(data$obs)) 
        data$obs <- factor(data$obs, levels = lev)
    # postResample() returns Accuracy and Kappa; keep Accuracy only
    Acc <- postResample(data[, "pred"], data[, "obs"])[1]

    out <- c(Acc, rocAUC)
    names(out) <- c("Accuracy","ROC")
    out
}


fitControl <- trainControl(classProbs = TRUE,
                           summaryFunction = mySummary)

set.seed(123)
svmRadial_acc_roc <- train(as.factor(target) ~ ., data = dat, method='svmRadial', trControl=fitControl)
glm_acc_roc <- train(as.factor(target) ~ ., data = dat, method='glm', trControl=fitControl)
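
One caveat with classProbs = TRUE: caret names the probability columns after the class levels, and depending on the caret version it may warn or stop when levels such as "0" and "1" are not valid R variable names. A minimal sketch of recoding the outcome first, using a short stand-in vector for dat$target and the illustrative labels "neg" and "pos" (these labels are an assumption, not part of the original post):

```r
# Toy stand-in for dat$target (a 0/1 coded outcome)
target <- c(0, 1, 1, 0)

# Recode to a factor whose levels are valid R names.
# "neg" and "pos" are illustrative labels; any valid names would do.
target_f <- factor(target, levels = c(0, 1), labels = c("neg", "pos"))

# make.names() leaves valid names unchanged, so this check passes
# only when every level is already a valid R variable name
all(levels(target_f) == make.names(levels(target_f)))
```

With the outcome recoded this way in dat, the formula can be written as target ~ . instead of as.factor(target) ~ ., and the first level ("neg", formerly 0) is still the one mySummary uses for the ROC.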

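I think it is generally considered better practice to look at the distribution of the results across resamples. For that, you can use the resamples function.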

results <- resamples(list(svm=svmRadial_acc_roc, glm=glm_acc_roc))
summary(results)

Call:
summary.resamples(object = results)

Models: svm, glm 
Number of resamples: 25 

Accuracy 
      Min. 1st Qu. Median   Mean 3rd Qu.   Max. NA's
svm 0.2500  0.5000  0.625 0.6034  0.6667 1.0000    0
glm 0.1667  0.4286  0.500 0.4993  0.6000 0.7143    0

ROC 
      Min. 1st Qu. Median   Mean 3rd Qu. Max. NA's
svm 0.4444  0.5608 0.6667 0.7422     1.0    1    1
glm 0.4444  0.6250 0.6667 0.7108     0.8    1    0

That said, if you really want that simple table:

# svmRadial was tuned over several values of C, so keep only the 'best tune' row
svm_result <- svmRadial_acc_roc$results[
    svmRadial_acc_roc$results$C == svmRadial_acc_roc$bestTune$C,
    c("Accuracy", "ROC")]
glm_result <- glm_acc_roc$results[,c("Accuracy", "ROC")]

# make data.frame
data.frame(ModelName = c("svmRadial", "glm"),
           Accuracy = c(svm_result$Accuracy, glm_result$Accuracy),
           ROC = c(svm_result$ROC, glm_result$ROC)
)

  ModelName  Accuracy       ROC
1 svmRadial 0.6034444 0.7421875
2       glm 0.4993333 0.7107778
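
The two-model extraction above can be generalized to any number of fitted models. The following is a sketch, assuming each fitted object carries the results and bestTune data frames used above and that they share the tuning-parameter columns; best_row and accuracy_table are hypothetical helper names, not caret functions:

```r
# Pick the resampled metrics for the tuning values that were selected.
# merge() joins on the columns shared by results and bestTune,
# which are assumed to be the tuning parameters (e.g. sigma and C).
best_row <- function(fit, metrics = c("Accuracy", "ROC")) {
  merged <- merge(fit$results, fit$bestTune)
  merged[1, metrics, drop = FALSE]
}

# Stack one row per model into a single data.frame,
# labelled with the names of the input list.
accuracy_table <- function(models) {
  rows <- lapply(names(models), function(nm) {
    data.frame(ModelName = nm, best_row(models[[nm]]), row.names = NULL)
  })
  do.call(rbind, rows)
}

# Usage with the fits from above:
# accuracy_table(list(svmRadial = svmRadial_acc_roc, glm = glm_acc_roc))
```

Matching on every shared tuning column, rather than on C alone, avoids hard-coding which parameters a given method tunes.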