xgboost: first several rounds do not learn anything
When I train xgboost and use AUC as the evaluation metric, I notice that the AUC score is stuck at 0.5 for the first several rounds. Basically this means the first trees are not learning anything:
Multiple eval metrics have been passed: 'eval-auc' will be used for early stopping.
Will train until eval-auc hasn't improved in 20 rounds.
[0] train-auc:0.5 eval-auc:0.5
[1] train-auc:0.5 eval-auc:0.5
[2] train-auc:0.5 eval-auc:0.5
[3] train-auc:0.5 eval-auc:0.5
[4] train-auc:0.5 eval-auc:0.5
[5] train-auc:0.5 eval-auc:0.5
[6] train-auc:0.5 eval-auc:0.5
[7] train-auc:0.5 eval-auc:0.5
[8] train-auc:0.5 eval-auc:0.5
[9] train-auc:0.5 eval-auc:0.5
[10] train-auc:0.5 eval-auc:0.5
[11] train-auc:0.5 eval-auc:0.5
[12] train-auc:0.5 eval-auc:0.5
[13] train-auc:0.5 eval-auc:0.5
[14] train-auc:0.537714 eval-auc:0.51776
[15] train-auc:0.541722 eval-auc:0.521087
[16] train-auc:0.555587 eval-auc:0.527019
[17] train-auc:0.669665 eval-auc:0.632106
[18] train-auc:0.6996 eval-auc:0.651677
[19] train-auc:0.721472 eval-auc:0.680481
[20] train-auc:0.722052 eval-auc:0.684549
[21] train-auc:0.736386 eval-auc:0.690942
As you can see, nothing is learned in the first 13 rounds.
The parameters I am using:
params = {'max_depth': 6, 'eta': 0.3, 'silent': 1, 'objective': 'binary:logistic'}
Using xgboost 0.8
Is there any way to prevent this?
Thanks
An AUC of 0.5 in the first rounds does not mean that XGBoost is not learning. Check whether your dataset is balanced. If it is not, the predictions for all instances (both target=1 and target=0) first move from the default value 0.5 toward the target mean, e.g. 0.17. During this phase the log-loss improves, so learning is going on, but the AUC does not move; only afterwards does the model reach the region where improving log-loss also improves AUC. If you want to help the algorithm reach this region immediately, change the base_score parameter from its default of 0.5 to the target mean.
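The reasoning above can be verified with a small sketch (synthetic labels with 17% positives; the log_loss and auc helpers are hand-rolled here, not part of XGBoost): a constant prediction always has AUC 0.5 no matter what the constant is, while its log-loss improves as the constant moves from 0.5 to the target mean.

```python
import numpy as np

def log_loss(y, p):
    # Mean binary cross-entropy
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def auc(y, p):
    # Mann-Whitney AUC; ties count as half a win
    pos, neg = p[y == 1], p[y == 0]
    wins = (pos[:, None] > neg[None, :]).sum() + 0.5 * (pos[:, None] == neg[None, :]).sum()
    return wins / (len(pos) * len(neg))

y = np.array([1] * 17 + [0] * 83)      # imbalanced: 17% positives

for const in (0.5, y.mean()):          # default base_score vs. target mean
    p = np.full(len(y), const, dtype=float)
    print(f"const={const:.2f}  auc={auc(y, p):.2f}  logloss={log_loss(y, p):.4f}")
# const=0.50  auc=0.50  logloss=0.6931
# const=0.17  auc=0.50  logloss=0.4559
```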
https://xgboost.readthedocs.io/en/latest/parameter.html