R: extracting initialized predictions in xgboost
library(xgboost)
data(agaricus.train, package='xgboost')
# Initialize baseline predictions to 1.5 for every observation
baseline_predictions <- rep(1.5, nrow(agaricus.train$data))
# base_margin is the base prediction XGBoost will boost from
dtrain <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label, base_margin = baseline_predictions)
param <- list(max_depth = 2, eta = 1, verbose = 0, nthread = 2,
objective = "binary:logistic", eval_metric = "auc")
bst <- xgb.train(param, dtrain, nrounds = 2)
> xgb.dump(bst, with_stats = TRUE)
[1] "booster[0]"
[2] "0:[f28<-9.53674316e-07] yes=1,no=2,missing=1,gain=6691.7876,cover=971.39093"
[3] "1:[f55<-9.53674316e-07] yes=3,no=4,missing=3,gain=1923.16174,cover=551.54364"
[4] "3:leaf=0.742681563,cover=484.427734"
[5] "4:leaf=-4.93142509,cover=67.1159134"
[6] "2:[f108<-9.53674316e-07] yes=5,no=6,missing=5,gain=336.239258,cover=419.847321"
[7] "5:leaf=-5.37396955,cover=411.942535"
[8] "6:leaf=1.08577335,cover=7.90476274"
[9] "booster[1]"
[10] "0:[f59<-9.53674316e-07] yes=1,no=2,missing=1,gain=1517.97913,cover=354.008148"
[11] "1:[f66<-9.53674316e-07] yes=3,no=4,missing=3,gain=1250.927,cover=340.298492"
[12] "3:leaf=0.488599688,cover=338.470062"
[13] "4:leaf=21.6099014,cover=1.82844138"
[14] "2:leaf=-9.71027374,cover=13.709651"
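In the code above, I initialize the predictions for all observations in the training data to 1.5 by specifying base_margin = baseline_predictions.
Using xgb.dump, I can see that the resulting trees are fitted accordingly. My question is whether the initial predictions can also be extracted. That is, given an XGBoost model bst, can I extract the baseline predictions (i.e., 1.5 for all observations)?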
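One way to solve this is to call xgboost::getinfo(object = dtrain, name = "base_margin") to retrieve baseline_predictions. This is useful whether the values were set in advance (e.g., to 1.5 as in this example) or computed from a preliminary training run (e.g., the baseline predictions in https://github.com/dmlc/xgboost/blob/master/R-package/demo/boost_from_prediction.R).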
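As a minimal sketch of that approach, the snippet below rebuilds the same dtrain as in the question and reads the margin back with getinfo. Note that the lookup goes through the xgb.DMatrix rather than through bst: as far as I know, base_margin is stored as metadata on the DMatrix, so the DMatrix used for training has to be kept around. The variable name recovered is only illustrative.

```r
library(xgboost)
data(agaricus.train, package = "xgboost")

# Same setup as in the question: a constant base margin of 1.5
baseline_predictions <- rep(1.5, nrow(agaricus.train$data))
dtrain <- xgb.DMatrix(agaricus.train$data,
                      label = agaricus.train$label,
                      base_margin = baseline_predictions)

# Read the base margin back from the DMatrix (not from the booster)
recovered <- getinfo(dtrain, "base_margin")

# Compare against the values that were supplied; the tolerance of
# all.equal() allows for the margin being stored in single precision
all.equal(as.numeric(recovered), baseline_predictions)
```

The same call should work when the margin comes from an earlier model's predictions instead of a constant, since getinfo simply returns whatever was attached to the DMatrix.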