The extracted gbm final model does not return the same result as the trained gbm model

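I am trying to use the final model extracted from a trained gbm model, but the extracted model does not return factor (class) results the way the trained model does. The extracted final model appears to work, but it only returns numeric values. How can I get factor results like the ones the trained model returns?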

library(caret)
library(mlbench)

data(Sonar)
set.seed(7)

Sonar$Class <- ifelse(Sonar$Class == 'R', 0, 1)
Sonar$Class <- as.factor(Sonar$Class)
validation_index <- createDataPartition(Sonar$Class, p=0.80, list=FALSE)
validation <- Sonar[-validation_index,]
training <- Sonar[validation_index,]
outcomename <- 'Class'
predictors <- names(training)[!names(training) %in% outcomename]

set.seed(7)
control <- trainControl(method = "repeatedcv",  number = 5,  repeats = 5)
model_gbm <- train(training[, predictors], training[, outcomename], method = 'gbm', trControl = control, tuneLength = 10)

predict(model_gbm, validation[,1:60])
[1] 1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Levels: 0 1

predict(model_gbm$finalModel, validation[,1:60], n.trees = 300)
[1]  -3.1174531  -1.8335718   5.0780422  -8.6681791   8.9634393  -1.4079936  11.7232458
[8]  18.4189859  14.3978772  11.3605253  13.4694812  10.2752696  11.4957672  10.0370462
[15]   8.6009983   0.3718381   0.1297673   2.4099186   6.7774090 -10.8356795 -10.1842065
[22]  -2.3222431  -8.1525336  -3.3665867 -10.7953353  -2.4607156 -11.4277641  -4.7164270
[29]  -6.3882544  -3.7306579  -6.9323133  -4.2643347  -0.2128462  -9.3395850 -13.0759289
[36] -12.8259643  -6.5314340 -12.7968160 -16.6217507 -12.0370978  -3.1100361

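The predict.gbm function has a type argument that can be "response" or "link". The default, "link", returns values on the log-odds scale for a binary (bernoulli) outcome, which is what the second call in the question shows. To get predicted probabilities, set type = "response". These probabilities can then be converted to classes with a threshold (caret's train uses 0.5). Here is an example to give you the idea: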

library(caret)
library(mlbench)

data(Sonar)
set.seed(7)

validation_index <- createDataPartition(Sonar$Class, p=0.80, list=FALSE)
validation <- Sonar[-validation_index,]
training <- Sonar[validation_index,]

set.seed(7)
control <- trainControl(method = "repeatedcv",
                        number = 2,
                        repeats = 2)
model_gbm <- train(Class~.,
                   data = training,
                   method = 'gbm',
                   trControl = control,
                   tuneLength = 3)

Predict with caret:

preds1 <- predict(model_gbm, validation[,1:60], type = "prob")

Predict with gbm:

library(gbm)
preds2 <- predict(model_gbm$finalModel, validation[,1:60], n.trees = 100, type = "response")

all.equal(preds1[,1], preds2)
#output
TRUE
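
The n.trees = 100 above is hard-coded and only matches caret when resampling happens to select 100 trees. A minimal sketch that reads the selected value from the train object instead is shown here; gbm_final_prob is a hypothetical helper name, not something caret or gbm provides, and it assumes a two-class gbm fit produced by caret::train:

```r
library(caret)
library(gbm)

# Hypothetical helper (not part of caret or gbm): probabilities from the
# gbm object stored inside a caret `train` fit.
gbm_final_prob <- function(fit, newdata) {
  # Assumes `fit` came from train(..., method = "gbm")
  stopifnot(inherits(fit, "train"), inherits(fit$finalModel, "gbm"))
  predict(fit$finalModel, newdata,
          n.trees = fit$bestTune$n.trees,  # tree count chosen by resampling
          type = "response")               # probability scale, not log-odds
}
```

With the objects above, gbm_final_prob(model_gbm, validation[,1:60]) could stand in for the preds2 call without having to know the tuned tree count in advance.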

For class predictions instead of probabilities:

preds1_class <- predict(model_gbm, validation[,1:60])

Check that they equal the gbm predictions after thresholding:

all.equal(
  as.factor(ifelse(preds2 > 0.5, "M", "R")),
  preds1_class)
#output
TRUE
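
To connect this back to the numbers in the question, here is a small base R sketch using the first seven link-scale values copied from the question's predict.gbm output. It assumes, as the comparison with preds1[,1] above suggests, that for a two-class outcome the gbm fit inside caret models the probability of the first factor level ("0" in the question):

```r
# First seven link-scale (log-odds) values from the question's output
link <- c(-3.1174531, -1.8335718, 5.0780422, -8.6681791,
          8.9634393, -1.4079936, 11.7232458)

# Inverse logit: probability of the first factor level (assumed to be "0")
prob_first <- plogis(link)

# Threshold at 0.5 and rebuild a factor with the original level order
lev <- c("0", "1")
pred_class <- factor(ifelse(prob_first > 0.5, lev[1], lev[2]), levels = lev)
print(pred_class)
```

The resulting classes are 1 1 0 1 0 1 0, which agrees with the first seven values returned by predict(model_gbm, ...) in the question. On the link scale, thresholding the probability at 0.5 is the same as checking the sign of the value.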