RandomForestClassifier - Odd error when trying to identify feature importance in sklearn?

Question

I am trying to retrieve the feature importances from a RandomForestClassifier model, that is, the coefficient of each feature in the model.

I am running the following code:

random_forest =  SelectFromModel(RandomForestClassifier(n_estimators = 200, random_state = 123))
random_forest.fit(X_train, y_train)
print(random_forest.estimator.feature_importances_)

But I get the following error:

NotFittedError: This RandomForestClassifier instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.

What exactly am I doing wrong? As you can see, I fit the model before reading the feature importances, yet it does not behave the way it should.

By comparison, I have the code below with a LogisticRegression model, and it works fine:

log_reg = SelectFromModel(LogisticRegression(class_weight = "balanced", random_state = 123))
log_reg.fit(X_train, y_train)
print(log_reg.estimator_.coef_)

Answer 1

You have to use the attribute estimator_ to access the fitted estimator (see the docs). Note that you forgot the trailing underscore. So it should be:

print(random_forest.estimator_.feature_importances_)

Interestingly, you got this right in your LogisticRegression example.
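
To illustrate the difference, here is a minimal sketch. It assumes synthetic data from make_classification in place of your X_train and y_train, which are not shown in the question, and the variable names are only illustrative. When SelectFromModel is fitted without prefit=True, it fits a clone of the estimator you passed in and stores that clone as estimator_, so the original object under estimator stays unfitted:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.exceptions import NotFittedError
from sklearn.feature_selection import SelectFromModel

# Synthetic stand-in for the asker's X_train / y_train (an assumption).
X_train, y_train = make_classification(
    n_samples=200, n_features=8, n_informative=3, random_state=123
)

random_forest = SelectFromModel(
    RandomForestClassifier(n_estimators=200, random_state=123)
)
random_forest.fit(X_train, y_train)

# `estimator` is the unfitted template passed to the constructor.
try:
    random_forest.estimator.feature_importances_
    template_is_fitted = True
except NotFittedError:
    template_is_fitted = False
print("template fitted:", template_is_fitted)

# `estimator_` (trailing underscore) is the fitted clone.
importances = random_forest.estimator_.feature_importances_
print(importances)

# Boolean mask of the features SelectFromModel keeps.
selected_mask = random_forest.get_support()
print(selected_mask)
```

The trailing underscore follows the scikit-learn convention that attributes learned during fit end in an underscore, which is also why coef_ and feature_importances_ are spelled that way.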

python

scikit-learn

sklearn-pandas