caret xgbTree model not running when weights are included, runs fine without them

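I have a dataset on which I can build an xgbTree model without weights with no problem, but as soon as I include weights, even weights that are all 1, the model does not converge. I get the error Something is wrong; all the RMSE metric values are missing: and when I print the warnings, the last message is In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, ... : There were missing values in resampled performance measures.

This is a drive link to an RData file containing the data. It is too large to print, and smaller samples do not always reproduce the error.

The file contains 3 objects: input_x, input_y, and wts. The last is just a vector of 1s, but ideally it should eventually accept numbers in the interval (0,1). The code I use is shown below. Note the comment next to the weights argument, which is the line that triggers the error.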

nrounds <- 1000

tune_grid <- expand.grid(
  nrounds = seq(from = 200, to = nrounds, by = 50),
  eta = c(0.025, 0.05, 0.1, 0.3),
  max_depth = c(2, 3, 4, 5),
  gamma = 0,
  colsample_bytree = 1,
  min_child_weight = 1,
  subsample = 1
)

tune_control <- caret::trainControl(
  method = "cv", 
  number = 3, 
  verboseIter = FALSE, 
  allowParallel = TRUE 
)

xgb_tune <- caret::train(
    x = input_x,
    y = input_y,
    weights = wts, # If I remove this line, the code works fine. When included, even if just 1s, it throws an error.
    trControl = tune_control,
    tuneGrid = tune_grid,
    method = "xgbTree",
    verbose = TRUE
  )
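Before passing weights to caret::train, it can help to rule out a malformed weight vector. The following is a minimal sketch in base R; check_case_weights is a hypothetical helper (not part of caret), and the stand-in data replaces nrow(input_x) and wts from the question. It assumes that negative weights are never intended.

```r
# Hypothetical helper (not part of caret): basic sanity checks on a case-weight vector.
check_case_weights <- function(w, n_obs) {
  stopifnot(
    is.numeric(w),
    length(w) == n_obs,  # one weight per row of the predictors
    !anyNA(w),           # no missing weights
    all(is.finite(w)),   # no Inf or NaN
    all(w >= 0)          # assumption: negative weights are never intended
  )
  invisible(TRUE)
}

# Stand-in data; with the question's objects this would be
# check_case_weights(wts, nrow(input_x)).
n_obs <- 100
wts_demo <- rep(1, n_obs)
check_case_weights(wts_demo, n_obs)
```

If this check passes, as it would for a vector of 1s of the right length, the problem lies elsewhere than in the shape or content of the weights.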

Edit, 13 October 2021. Thanks to @waterpolo.

The correct way to specify weights is through the weights argument of caret::train:

xgb_tune <- caret::train(
    x = input_x,
    y = input_y,
    weights = wts,
    trControl = tune_control,
    tuneGrid = tune_grid,
    method = "xgbTree",
    verbose = TRUE
  )

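See the more detailed answer here:

Below is the old, incorrect answer:

According to the function source, the weights parameter is called wts.

The relevant lines: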

if (!is.null(wts))
  xgboost::setinfo(x, 'weight', wts)

Running

xgb_tune <- caret::train(
    x = input_x,
    y = input_y,
    wts = wts,
    trControl = tune_control,
    tuneGrid = tune_grid,
    method = "xgbTree",
    verbose = TRUE
  )

should produce the expected result.

Just wanted to add @missuse's response from another post. The correct argument is weights.

Code:

xgb_tune <- caret::train(x = input_x,
    y = input_y,
    weights = wts,
    trControl = tune_control,
    tuneGrid = tune_grid,
    method = "xgbTree",
    verbose = TRUE
  )

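Another thing I found is that I need to use weights greater than 1, otherwise I get the same error message as you. For example, if I use inverse weighting, I get the same message. Hope this helps.

Thanks to @missuse for the lovely reply in the other post!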
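One way to act on that observation while keeping the relative weighting intact is to rescale the weights so that the smallest one equals 1. This is only a sketch in base R: rescale_min_one is a hypothetical helper, the inverse-frequency weights are made-up stand-in data, and whether rescaling removes the error on a given dataset is not confirmed here.

```r
# Hypothetical helper: rescale strictly positive weights so the smallest equals 1.
# Ratios between weights are unchanged; only the overall scale moves.
rescale_min_one <- function(w) {
  stopifnot(is.numeric(w), !anyNA(w), all(w > 0))
  w / min(w)
}

# Stand-in inverse-frequency weights, all below 1.
group <- c("a", "a", "a", "b", "b")
inv_w <- 1 / as.numeric(table(group)[group])

scaled_w <- rescale_min_one(inv_w)

# The rescaled vector could then be passed as weights = scaled_w to caret::train.
```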