R caret: Tuning GLM boost prune parameter
I am trying to tune the parameters of a boosted GLM model. According to the caret package documentation for this model, there are 2 parameters that can be tuned: mstop and prune.
library(caret)
library(mlbench)
data(Sonar)
set.seed(25)
trainIndex = createDataPartition(Sonar$Class, p = 0.9, list = FALSE)
training = Sonar[ trainIndex,]
testing = Sonar[-trainIndex,]
### set training parameters
fitControl = trainControl(method = "repeatedcv",
                          number = 10,
                          repeats = 10,
                          ## Estimate class probabilities
                          classProbs = TRUE,
                          ## Evaluate two-class performance
                          ## (ROC, sensitivity, specificity) using the following function
                          summaryFunction = twoClassSummary)
### train the models
set.seed(69)
# Use the expand.grid to specify the search space
glmBoostGrid = expand.grid(mstop = c(50, 100, 150, 200, 250, 300),
                           prune = c('yes', 'no'))
glmBoostFit = train(Class ~ .,
                    data = training,
                    method = "glmboost",
                    trControl = fitControl,
                    tuneGrid = glmBoostGrid,
                    metric = 'ROC')
glmBoostFit
The output is as follows:
Boosted Generalized Linear Model
188 samples
60 predictors
2 classes: 'M', 'R'
No pre-processing
Resampling: Cross-Validated (10 fold, repeated 10 times)
Summary of sample sizes: 169, 169, 169, 169, 170, 169, ...
Resampling results across tuning parameters:
  mstop  ROC        Sens   Spec       ROC SD      Sens SD    Spec SD
   50    0.8261806  0.764  0.7598611  0.10208114  0.1311104  0.1539477
  100    0.8265972  0.729  0.7625000  0.09459835  0.1391250  0.1385465
  150    0.8282083  0.717  0.7726389  0.09570417  0.1418152  0.1382405
  200    0.8307917  0.714  0.7769444  0.09484042  0.1439011  0.1452857
  250    0.8306667  0.719  0.7756944  0.09452604  0.1436740  0.1535578
  300    0.8278403  0.728  0.7722222  0.09794868  0.1425398  0.1576030
Tuning parameter 'prune' was held constant at a value of yes
ROC was used to select the optimal model using the largest value.
The final values used for the model were mstop = 200 and prune = yes.
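The prune parameter is held constant (Tuning parameter 'prune' was held constant at a value of yes) even though glmBoostGrid also contains prune == no. I looked at the mboost package documentation for the boost_control method, and only the mstop parameter is accessible there. So how can the prune parameter be tuned with the tuneGrid argument of the train method?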
The difference lies in this part of the call to glmboost:
if (param$prune == "yes") {
  out <- if (is.factor(y))
    out[mstop(AIC(out, "classical"))]
  else out[mstop(AIC(out))]
}
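With prune set to "yes", the fitted model is cut back to the number of boosting iterations that minimizes the AIC. The two branches differ only in how the AIC is calculated: the "classical" method for a factor outcome, the default method otherwise. However, after running various tests with glmboost in caret, I doubt that it behaves as expected. I have opened an issue on GitHub to check whether my suspicion is correct. If the developers provide more information, I will edit my answer.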
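To make the pruning step concrete, here is a minimal sketch that reproduces it with mboost directly, outside of caret. It assumes the full Sonar data and a fixed mstop of 200 purely for illustration; the object names fit, aic, best and pruned are not from caret's source, and this is not a statement of how train handles the grid.

```r
library(mboost)
library(mlbench)
data(Sonar)

# Fit a boosted GLM directly with mboost (illustrative settings, not caret's)
fit <- glmboost(Class ~ ., data = Sonar, family = Binomial(),
                control = boost_control(mstop = 200))

# AIC along the boosting path; "classical" mirrors the factor-outcome branch
aic <- AIC(fit, method = "classical")

# Iteration count at which the AIC is smallest
best <- mstop(aic)

# Subsetting sets the model to `best` iterations.
# Note: mboost's `[` operator also alters `fit` itself.
pruned <- fit[best]
mstop(pruned)
```

Comparing best with the mstop passed to boost_control shows whether pruning would shorten the model at all for a given grid value.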