XGBoost plot_importance has no property max_num_features

xgboost's plotting API states:

xgboost.plot_importance(booster, ax=None, height=0.2, xlim=None, ylim=None, title='Feature importance', xlabel='F score', ylabel='Features', importance_type='weight', max_num_features=None, grid=True, **kwargs)



booster (Booster, XGBModel or dict) – Booster or XGBModel instance, or dict taken by Booster.get_fscore()
max_num_features (int, default None) – Maximum number of top features displayed on plot. If None, all features will be displayed.


Yet running:

from xgboost import XGBClassifier, plot_importance

booster_ = XGBClassifier(learning_rate=0.1, max_depth=3, n_estimators=100,
                      silent=False, objective='binary:logistic', nthread=-1,
                      gamma=0, min_child_weight=1, max_delta_step=0, subsample=1,
                      colsample_bytree=1, colsample_bylevel=1, reg_alpha=0,
                      reg_lambda=1, scale_pos_weight=1, base_score=0.5, seed=0)

# X_train and y_train are the training data (not shown)
booster_.fit(X_train, y_train)

plot_importance(booster_, max_num_features=10)


raises:

AttributeError: Unknown property max_num_features

Running it without the max_num_features argument, however, correctly plots the entire feature set (which in my case is huge, ~10k features). Any idea what is going on?



> python -V
  Python 2.7.12 :: Anaconda custom (x86_64)

> pip freeze | grep xgboost


import matplotlib.pyplot as plt

def feat_imp(df, model, n_features):
    # Map each column name to its importance, then rank the names by importance.
    d = dict(zip(df.columns, model.feature_importances_))
    ss = sorted(d, key=d.get, reverse=True)
    top_names = ss[0:n_features]

    # Bar chart of the n_features most important features.
    plt.title("Feature importances")
    plt.bar(range(n_features), [d[i] for i in top_names], color="r", align="center")
    plt.xlim(-1, n_features)
    plt.xticks(range(n_features), top_names, rotation='vertical')

feat_imp(filled_train_full, booster_, 20)

Try upgrading your xgboost library to 0.6; that should fix the problem. To upgrade the package, run:

$ pip install -U xgboost


If the upgrade fails to build (for example on macOS), installing gcc first may help:

$ brew install gcc@5
$ pip install -U xgboost
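
After upgrading, it may help to confirm which version actually ended up installed. The following is a minimal sketch, assuming Python 3.8 or later (importlib.metadata is not available on the Python 2.7 interpreter mentioned in the question); installed_version is a hypothetical helper name, not part of xgboost.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string, or None when xgboost is not installed.
print("xgboost version:", installed_version("xgboost"))
```

This reports the same information as `pip freeze | grep xgboost`, but from inside the interpreter that will run the plotting code.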


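Despite the title of the documentation webpage ("Python API Reference - xgboost 0.6 documentation"), it does not contain the documentation for the 0.6 release of xgboost. Instead, it seems to contain the documentation for the latest git master branch.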

The 0.6 release of xgboost was made on Jul 29 2016:

This is a stable release of 0.6 version

@tqchen tqchen released this on Jul 29 2016 · 245 commits to master since this release

The commit that added max_num_features to plot_importance() was made on Jan 16 2017, after that release.

As a further check, let's inspect the 0.60 release tarball:

pushd /tmp
curl -SLO https://github.com/dmlc/xgboost/archive/v0.60.tar.gz
tar -xf v0.60.tar.gz 
grep num_features xgboost-0.60/python-package/xgboost/plotting.py
# .. silence.

So this seems to be a documentation bug in the xgboost project.
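
The same check can be sketched from Python without downloading the tarball, by asking whether a function names the keyword in its signature. The error text looks like what matplotlib raises for an unrecognized property, which would be consistent with an unknown keyword being passed through **kwargs, though that is an inference rather than something confirmed here. In this sketch, accepts_keyword and the two stand-in functions are hypothetical; they only mimic an older and a newer signature.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if func declares `name` as an explicit parameter."""
    return name in inspect.signature(func).parameters


# Hypothetical stand-in for an older signature: no max_num_features,
# so that keyword would fall into **kwargs.
def old_plot_importance(booster, ax=None, height=0.2, **kwargs):
    pass


# Hypothetical stand-in for the documented signature.
def new_plot_importance(booster, ax=None, height=0.2,
                        max_num_features=None, **kwargs):
    pass


print(accepts_keyword(old_plot_importance, "max_num_features"))  # False
print(accepts_keyword(new_plot_importance, "max_num_features"))  # True
```

With xgboost installed, calling accepts_keyword(xgboost.plot_importance, "max_num_features") shows whether the installed version matches the online documentation.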


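# bst is a trained Booster and ax an existing matplotlib Axes (neither shown here)
top_n = 50  # renamed from max to avoid shadowing the builtin
top_scores = dict(sorted(bst.get_fscore().items(), key=lambda x: x[1], reverse=True)[:top_n])
xgboost.plot_importance(top_scores, ax=ax, height=0.8)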

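Since you can also pass a dictionary to the plot function, you basically get the fscore, sort the items in descending order, select the desired number of top features, and then convert back to a dictionary.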
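
The sort-and-slice step can be tried on its own, without a trained model. This is a small sketch: top_n_scores is a hypothetical helper name, and the fscore values are made up purely to show the shape of a dict like the one returned by Booster.get_fscore().

```python
def top_n_scores(scores, n):
    """Keep the n entries with the largest values from a name -> score dict."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:n])


# Made-up scores, shaped like the output of Booster.get_fscore().
fscore = {"f0": 12, "f1": 3, "f2": 40, "f3": 7}
print(top_n_scores(fscore, 2))  # {'f2': 40, 'f0': 12}
```

The resulting dict can then be passed to xgboost.plot_importance in place of the booster, as in the snippet above.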
