{caret}xgbTree model not running when weights included, runs fine without them
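I have a dataset on which I can build an xgbTree model without weights with no problem, but as soon as I include weights (even if the weights are all 1) the model does not converge. I get the error
Something is wrong; all the RMSE metric values are missing:
and when I print the warnings, the last message is
In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, ... : There were missing values in resampled performance measures.
This is a drive link to an RData file containing the data. It is too large to print, and smaller samples do not always reproduce the error.
It contains 3 objects: input_x, input_y, and wts. The last is just a vector of 1s, but ideally it should eventually be able to accept numbers on the interval (0,1). The code I'm using is shown below. Note the comment next to the weights argument that produces the error.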
nrounds <- 1000
tune_grid <- expand.grid(
  nrounds = seq(from = 200, to = nrounds, by = 50),
  eta = c(0.025, 0.05, 0.1, 0.3),
  max_depth = c(2, 3, 4, 5),
  gamma = 0,
  colsample_bytree = 1,
  min_child_weight = 1,
  subsample = 1
)
tune_control <- caret::trainControl(
  method = "cv",
  number = 3,
  verboseIter = FALSE,
  allowParallel = TRUE
)
xgb_tune <- caret::train(
  x = input_x,
  y = input_y,
  weights = wts, # If I remove this line, the code works fine. When included, even if just 1s, it throws an error.
  trControl = tune_control,
  tuneGrid = tune_grid,
  method = "xgbTree",
  verbose = TRUE
)
Edit, 13 October 2021. Thanks to @waterpolo: the correct way to specify weights is through the weights argument passed to caret::train.
xgb_tune <- caret::train(
  x = input_x,
  y = input_y,
  weights = wts,
  trControl = tune_control,
  tuneGrid = tune_grid,
  method = "xgbTree",
  verbose = TRUE
)
See a more detailed answer here:
Below is the old, incorrect answer:
According to the function source, the weights argument is called wts.
The lines:
if (!is.null(wts))
  xgboost::setinfo(x, 'weight', wts)
Running
xgb_tune <- caret::train(
  x = input_x,
  y = input_y,
  wts = wts,
  trControl = tune_control,
  tuneGrid = tune_grid,
  method = "xgbTree",
  verbose = TRUE
)
should produce the expected result.
Just wanted to add @missuse's response from another post. The correct argument is weights.
Code:
xgb_tune <- caret::train(
  x = input_x,
  y = input_y,
  weights = wts,
  trControl = tune_control,
  tuneGrid = tune_grid,
  method = "xgbTree",
  verbose = TRUE
)
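Another thing I found is that I needed to use weights greater than 1; otherwise I got the same error message as you. For example, when I used inverse weighting, I got the same message. Hope this helps.
Thanks to @missuse for the lovely response in the other post!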
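If the observation about small weights holds for your data, one thing you could try is rescaling the weights before passing them to caret::train. The sketch below is only that: rescale_weights and its target_min parameter are hypothetical names of my own, not part of caret or xgboost, and the example weights are made up. It checks that the weight vector lines up with the data and then scales it so the smallest weight equals target_min, which keeps the relative weighting between observations unchanged. I have not verified that this removes the error in the question.

```r
# Hypothetical helper (not part of caret or xgboost): validate case weights
# and rescale them so that the smallest weight equals `target_min`.
rescale_weights <- function(w, n_obs, target_min = 2) {
  # The weights must be numeric and there must be one per observation.
  stopifnot(is.numeric(w), length(w) == n_obs)
  # Missing, infinite, or non-positive weights are rejected up front.
  stopifnot(!anyNA(w), all(is.finite(w)), all(w > 0))
  # Dividing by the minimum preserves the ratios between weights.
  w / min(w) * target_min
}

# Made-up inverse-style weights on (0, 1], standing in for the real `wts`.
raw_wts <- c(0.25, 0.5, 0.5, 1)
wts_scaled <- rescale_weights(raw_wts, n_obs = length(raw_wts))
print(wts_scaled)
```

You would then pass wts_scaled as weights = wts_scaled in the caret::train call above, with n_obs set to nrow(input_x). The choice of target_min = 2 is arbitrary; it is only there so that every weight ends up above 1.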