R caret: Tuning GLM boost prune parameter

I am trying to tune the parameters of a boosted GLM. According to the caret package documentation for this model, there are two parameters that can be tuned: mstop and prune.
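
This can also be confirmed from within R: caret's modelLookup() lists the parameters that train() exposes for a given method (a quick sketch; the output columns may differ slightly between caret versions):

    library(caret)
    ## list the tuning parameters available to train() for method = "glmboost";
    ## this should show the two parameters mentioned above, mstop and prune
    modelLookup("glmboost")

With both parameters confirmed, the model can be trained over a grid that covers them: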

    library(caret)
    library(mlbench)

    data(Sonar)

    set.seed(25)
    trainIndex = createDataPartition(Sonar$Class, p = 0.9, list = FALSE)
    training = Sonar[ trainIndex,]
    testing  = Sonar[-trainIndex,]

    ### set training parameters
    fitControl = trainControl(method = "repeatedcv",
                              number = 10,
                              repeats = 10,
                              ## Estimate class probabilities
                              classProbs = TRUE,
                              ## Evaluate two-class performance
                              ## (ROC, sensitivity, specificity) using the following function
                              summaryFunction = twoClassSummary)

    ### train the models
    set.seed(69)
    # Use the expand.grid to specify the search space   
    glmBoostGrid = expand.grid(mstop = c(50, 100, 150, 200, 250, 300),
                               prune = c('yes', 'no'))

    glmBoostFit = train(Class ~ ., 
                        data = training,
                        method = "glmboost",
                        trControl = fitControl,
                        tuneGrid = glmBoostGrid,
                        metric = 'ROC')
    glmBoostFit

The output is as follows:

    Boosted Generalized Linear Model 

    188 samples
     60 predictors
      2 classes: 'M', 'R' 

    No pre-processing
    Resampling: Cross-Validated (10 fold, repeated 10 times) 
    Summary of sample sizes: 169, 169, 169, 169, 170, 169, ... 
    Resampling results across tuning parameters:

      mstop  ROC        Sens   Spec       ROC SD      Sens SD    Spec SD  
       50    0.8261806  0.764  0.7598611  0.10208114  0.1311104  0.1539477
      100    0.8265972  0.729  0.7625000  0.09459835  0.1391250  0.1385465
      150    0.8282083  0.717  0.7726389  0.09570417  0.1418152  0.1382405
      200    0.8307917  0.714  0.7769444  0.09484042  0.1439011  0.1452857
      250    0.8306667  0.719  0.7756944  0.09452604  0.1436740  0.1535578
      300    0.8278403  0.728  0.7722222  0.09794868  0.1425398  0.1576030

    Tuning parameter 'prune' was held constant at a value of yes
    ROC was used to select the optimal model using the largest value.
    The final values used for the model were mstop = 200 and prune = yes. 

The prune parameter is held constant ("Tuning parameter 'prune' was held constant at a value of yes") even though glmBoostGrid also contains prune == no. I looked at the mboost package documentation for boost_control, and only the mstop parameter is accessible there, so how can the prune parameter be tuned through the tuneGrid argument of the train method?

The difference is made in this part of the call to glmboost:

    if (param$prune == "yes") {
        out <- if (is.factor(y)) 
            out[mstop(AIC(out, "classical"))]
        else out[mstop(AIC(out))]
    }
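
That snippet is part of caret's internal model definition for glmboost; the fit code can be printed directly from R to see how prune is handled (a sketch, assuming a recent caret version; the exact body may differ between releases):

    library(caret)
    ## print the fit function caret uses for method = "glmboost";
    ## the prune handling quoted above is part of this function
    getModelInfo("glmboost", regex = FALSE)[[1]]$fit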

The difference lies in how the AIC is computed. But after running various tests with glmboost in caret, I doubt that it behaves as intended. I have created an issue on GitHub to see whether my suspicion is correct. I will edit my answer if the developers provide more information.
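
What the prune == "yes" branch is supposed to do can be reproduced with mboost directly (a minimal sketch using the training data from the question; the family and control settings are assumptions and may not match exactly what caret passes internally):

    library(mboost)

    ## fit the same kind of boosted GLM outside of caret
    glmBoostManual = glmboost(Class ~ ., data = training,
                              family = Binomial(),
                              control = boost_control(mstop = 200))

    ## classical AIC, as in the quoted code when the response is a factor
    aic = AIC(glmBoostManual, method = "classical")

    ## truncate the model at the AIC-optimal number of boosting iterations
    glmBoostPruned = glmBoostManual[mstop(aic)]
    mstop(glmBoostPruned)

If pruning works as intended inside caret, the effective number of boosting iterations of the final model should come from this AIC step rather than from the raw grid value.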