XGBoost algorithm: question about the evaluate_model function

This evaluate_model function is used quite frequently, and I found it used by IBM here. I will reproduce the function below:

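import matplotlib.pyplot as plt
import pandas as pd
import xgboost as xgb
from sklearn import metrics
from xgboost import XGBClassifier

def evaluate_model(alg, train, target, predictors, useTrainCV=True, cv_folds=5, early_stopping_rounds=50):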

    if useTrainCV:
        xgb_param = alg.get_xgb_params()
        xgtrain = xgb.DMatrix(train[predictors].values, target['Default Flag'].values)
        cvresult = xgb.cv(xgb_param, xgtrain, num_boost_round=alg.get_params()['n_estimators'], nfold=cv_folds,
            metrics='auc', early_stopping_rounds=early_stopping_rounds, verbose_eval=True)
        alg.set_params(n_estimators=cvresult.shape[0])

    #Fit the algorithm on the data
    alg.fit(train[predictors], target['Default Flag'], eval_metric='auc')

    #Predict training set:
    dtrain_predictions = alg.predict(train[predictors])
    dtrain_predprob = alg.predict_proba(train[predictors])[:,1]

    #Print model report:
    print("\nModel Report")
    print("Accuracy : %.6g" % metrics.accuracy_score(target['Default Flag'].values, dtrain_predictions))
    print("AUC Score (Train): %f" % metrics.roc_auc_score(target['Default Flag'], dtrain_predprob))  
    plt.figure(figsize=(12,12))
    feat_imp = pd.Series(alg.get_booster().get_fscore()).sort_values(ascending=False)
    feat_imp.plot(kind='bar', title='Feature Importance', color='g')
    plt.ylabel('Feature Importance Score')
    plt.show()

After tuning the XGBoost parameters, I have:

xgb4 = XGBClassifier(
    objective="binary:logistic", 
    learning_rate=0.10,  
    n_estimators=5000,
    max_depth=6,
    min_child_weight=1,
    gamma=0.1,
    subsample=0.8,
    colsample_bytree=0.8,
    reg_alpha=0.1,
    nthread=4,
    scale_pos_weight=1.0,
    seed=27)
features = [x for x in X_train.columns if x not in ['Default Flag','ID']]
evaluate_model(xgb4, X_train, y_train, features)

The result I get is:

Model Report
Accuracy : 0.803236
AUC Score (Train): 0.856995

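My question, and perhaps I am ill-informed, is that this evaluate_model() function is never run on a test set, which I find strange. When I call it on the test set (evaluate_model(xgb4, X_test, y_test, features)) I get this: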

Model Report
Accuracy : 0.873706
AUC Score (Train): 0.965286

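Given that the test set shows higher accuracy than the training set, I am wondering whether these two model reports are related at all. I apologize if this question is poorly structured.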

Let me develop my answer a bit further:

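This function trains on the dataset you pass in and reports the accuracy and AUC on that same data, i.e. training scores. It is therefore not a reliable way to evaluate your model.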
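A minimal sketch of the distinction, assuming synthetic data from `make_classification` and scikit-learn's `GradientBoostingClassifier` as a stand-in for `XGBClassifier` (both expose `fit`, `predict` and `predict_proba`). The helper name `report_scores` is hypothetical. The model is fitted once on the training split and then only scored, never refitted, on each split:

```python
from sklearn import metrics
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split


def report_scores(alg, X, y):
    """Score an already-fitted classifier on (X, y) without refitting it."""
    pred = alg.predict(X)
    prob = alg.predict_proba(X)[:, 1]
    return {
        "accuracy": metrics.accuracy_score(y, pred),
        "auc": metrics.roc_auc_score(y, prob),
    }


# Synthetic binary classification data (an assumption, not the poster's data).
X, y = make_classification(n_samples=1000, n_features=20, random_state=27)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=27
)

# Stand-in for XGBClassifier; fitted on the training split only.
model = GradientBoostingClassifier(n_estimators=100, random_state=27)
model.fit(X_train, y_train)

train_scores = report_scores(model, X_train, y_train)
test_scores = report_scores(model, X_test, y_test)
print("train:", train_scores)
print("test: ", test_scores)
```

Only the second line of output says anything about how the model behaves on unseen data.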

In the link you provided, this function is described as a way to tune the number of estimators:

The function below performs the following actions to find the best number of boosting trees to use on your data:

  • Trains an XGBoost model using features of the data.
  • Performs k-fold cross validation on the model, using accuracy and AUC score as the evaluation metric.
  • Returns output for each boosting round so you can see how the model is learning. You will look at the detailed output in the next
    section.
  • It stops running after the cross-validation score does not improve significantly with additional boosting rounds, giving you an
    optimal number of estimators for the model.
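The procedure quoted above can be sketched roughly as follows. This is not the `xgb.cv` implementation: it assumes scikit-learn's `GradientBoostingClassifier` as a stand-in booster, synthetic data, AUC as the only metric, and a hypothetical helper name `cv_best_n_estimators`. It tracks the mean validation AUC after every boosting round across the folds and stops once that mean has not improved for `early_stopping_rounds` rounds.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold


def cv_best_n_estimators(X, y, max_rounds=200, cv_folds=5, early_stopping_rounds=20):
    """Return (best number of rounds, mean validation AUC per round)."""
    folds = StratifiedKFold(n_splits=cv_folds, shuffle=True, random_state=27)
    fold_aucs = []
    for train_idx, valid_idx in folds.split(X, y):
        model = GradientBoostingClassifier(n_estimators=max_rounds, random_state=27)
        model.fit(X[train_idx], y[train_idx])
        # Validation AUC after each boosting round for this fold.
        fold_aucs.append([
            roc_auc_score(y[valid_idx], prob[:, 1])
            for prob in model.staged_predict_proba(X[valid_idx])
        ])
    mean_auc = np.mean(fold_aucs, axis=0)

    # Early stopping: give up once the mean AUC has not improved for
    # `early_stopping_rounds` consecutive rounds.
    best_round, best_auc = 0, -np.inf
    for i, auc in enumerate(mean_auc):
        if auc > best_auc:
            best_round, best_auc = i, auc
        elif i - best_round >= early_stopping_rounds:
            break
    return best_round + 1, mean_auc


# Synthetic data (an assumption, not the poster's data).
X, y = make_classification(n_samples=600, n_features=20, random_state=27)
best_n, mean_auc = cv_best_n_estimators(X, y, max_rounds=100)
print("best number of trees:", best_n)
print("cross-validated AUC at that point: %.4f" % mean_auc[best_n - 1])
```

In the original function, `cvresult.shape[0]` plays the role of `best_n`: it is the number of rounds kept after early stopping, which is then written back with `alg.set_params(n_estimators=...)`.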

You should not use it to evaluate model performance; run a clean cross-validation instead.
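One way to do that, as a sketch rather than a prescription, is scikit-learn's `cross_validate`. The example assumes synthetic data and uses `GradientBoostingClassifier` as a stand-in; an `XGBClassifier` instance follows the same scikit-learn estimator interface and could be passed in its place. Each fold's score is computed on rows the model did not see during fitting.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import StratifiedKFold, cross_validate

# Synthetic data (an assumption, not the poster's data).
X, y = make_classification(n_samples=600, n_features=20, random_state=27)

# Stand-in estimator; it is cloned and refitted from scratch for every fold.
model = GradientBoostingClassifier(n_estimators=100, random_state=27)
folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=27)

results = cross_validate(model, X, y, cv=folds, scoring=["accuracy", "roc_auc"])

acc = results["test_accuracy"]
auc = results["test_roc_auc"]
print("accuracy: %.4f +/- %.4f" % (acc.mean(), acc.std()))
print("AUC:      %.4f +/- %.4f" % (auc.mean(), auc.std()))
```

The spread across folds also gives a rough sense of how much a single train/test split can move the reported numbers.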

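Your test score is higher in this case because evaluate_model refits the model on whatever data it receives: calling it with X_test trains on the test set and then reports the score on those same rows. Your test set is smaller, so the model overfits it more easily.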