The extracted gbm final model does not return the same result as the trained gbm model

Question

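I am trying to use the final model extracted from a trained gbm model, but the extracted model does not return factor (class) results the way the trained model does. The extracted final model appears to work, but it only returns numeric values. How can I get factor results like those from the trained model?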

library(caret)
library(mlbench)

data(Sonar)
set.seed(7)

Sonar$Class <- ifelse(Sonar$Class == 'R', 0, 1)
Sonar$Class <- as.factor(Sonar$Class)
validation_index <- createDataPartition(Sonar$Class, p=0.80, list=FALSE)
validation <- Sonar[-validation_index,]
training <- Sonar[validation_index,]
outcomename <- 'Class'
predictors <- names(training)[!names(training) %in% outcomename]

set.seed(7)
control <- trainControl(method = "repeatedcv",  number = 5,  repeats = 5)
model_gbm <- train(training[, predictors], training[, outcomename], method = 'gbm', trControl = control, tuneLength = 10)

predict(model_gbm, validation[,1:60])
[1] 1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Levels: 0 1

predict(model_gbm$finalModel, validation[,1:60], n.trees = 300)
[1]  -3.1174531  -1.8335718   5.0780422  -8.6681791   8.9634393  -1.4079936  11.7232458
[8]  18.4189859  14.3978772  11.3605253  13.4694812  10.2752696  11.4957672  10.0370462
[15]   8.6009983   0.3718381   0.1297673   2.4099186   6.7774090 -10.8356795 -10.1842065
[22]  -2.3222431  -8.1525336  -3.3665867 -10.7953353  -2.4607156 -11.4277641  -4.7164270
[29]  -6.3882544  -3.7306579  -6.9323133  -4.2643347  -0.2128462  -9.3395850 -13.0759289
[36] -12.8259643  -6.5314340 -12.7968160 -16.6217507 -12.0370978  -3.1100361

Answer 1

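The predict.gbm function has a type argument, which can be "response" or "link". To get predicted probabilities, set it to "response". These probabilities can then be converted to classes with a threshold (caret's train uses 0.5). Here is an example to illustrate the idea: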

library(caret)
library(mlbench)

data(Sonar)
set.seed(7)

validation_index <- createDataPartition(Sonar$Class, p=0.80, list=FALSE)
validation <- Sonar[-validation_index,]
training <- Sonar[validation_index,]

set.seed(7)
control <- trainControl(method = "repeatedcv",
                        number = 2,
                        repeats = 2)
model_gbm <- train(Class~.,
                   data = training,
                   method = 'gbm',
                   trControl = control,
                   tuneLength = 3)

Predict with caret:

preds1 <- predict(model_gbm, validation[,1:60], type = "prob")

Predict with gbm:

library(gbm)
preds2 <- predict(model_gbm$finalModel, validation[,1:60], n.trees = 100, type = "response")

all.equal(preds1[,1], preds2)
#output
TRUE
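
For context on the numbers in the question: predict.gbm defaults to type = "link", which for a two-class (Bernoulli) fit is the log-odds scale. The following is a minimal base-R sketch, assuming the fit is Bernoulli and that the modeled event is the first factor level (as the comparison with preds1[,1] above suggests). It maps the first seven values from the question's output back to probabilities and classes. The variable names are illustrative only.

```r
# First seven link-scale values copied from the question's output.
link <- c(-3.1174531, -1.8335718, 5.0780422, -8.6681791,
          8.9634393, -1.4079936, 11.7232458)

# Inverse logit: log-odds -> probability of the modeled event.
prob_first <- plogis(link)

# Assumption: the modeled event is the first factor level ("0" in the question).
lev <- c("0", "1")
pred_class <- factor(ifelse(prob_first > 0.5, lev[1], lev[2]), levels = lev)
print(pred_class)
```

Under these assumptions the result is 1 1 0 1 0 1 0, which matches the first seven class predictions that predict(model_gbm, ...) returned in the question.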

Or, for classes:

preds1_class <- predict(model_gbm, validation[,1:60])

Check that they are equal to the thresholded gbm predictions:

all.equal(
  as.factor(ifelse(preds2 > 0.5, "M", "R")),
  preds1_class)
#output
TRUE
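
The steps above can be wrapped in a small helper so that n.trees is not hard-coded. This is only a sketch: predict_final_gbm is a hypothetical name, not part of caret or gbm. It assumes a two-class model, that the tuned number of trees is available in fit$bestTune$n.trees, that the response-scale prediction is the probability of the first class level, and that the caller supplies the levels in the same order as the training factor. Treating a probability of exactly the threshold as the first level is also an assumption.

```r
# Hypothetical helper: class predictions from the finalModel of a caret gbm fit.
predict_final_gbm <- function(fit, newdata, levels, threshold = 0.5) {
  # Number of trees selected during tuning (assumed to live in bestTune).
  n_trees <- fit$bestTune$n.trees

  # Probability of the first class level, on the response scale.
  prob_first <- predict(fit$finalModel, newdata,
                        n.trees = n_trees, type = "response")

  # Threshold the probabilities and return a factor with the given levels.
  factor(ifelse(prob_first >= threshold, levels[1], levels[2]),
         levels = levels)
}

# Intended usage with the objects from the answer (requires caret and gbm):
# predict_final_gbm(model_gbm, validation[, 1:60], levels = c("M", "R"))
```

Passing the levels explicitly keeps the sketch independent of how a particular caret version stores them on the fitted object.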

r

gbm

r-caret