Random Forest Classifier: feature importance of prediction probability

Question

I am using sklearn's RFC (RandomForestClassifier).

forest.fit(training_data, y_train)
probas_test = forest.predict_proba(test_data)

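I would like to know whether there is a way to find the contribution/importance of each feature that led to a given prediction.

Something like the following, but at the level of a single data point: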

   forest.feature_importances_

Answer 1

This can be approached in several ways; see http://blog.datadive.net/interpreting-random-forests/ (and a Python package for that: https://github.com/andosa/treeinterpreter). There are also less direct options, such as
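
The decomposition described in the linked blog post writes each predicted probability as a bias term (the class distribution at the root of each tree) plus one contribution per feature, collected along the path the sample takes through the tree. Below is a minimal sketch of that idea using only scikit-learn. It is not the treeinterpreter package's own code: the helper names tree_contributions and forest_contributions are made up for illustration, and the iris data merely stands in for training_data / test_data from the question.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier


def tree_contributions(tree, x):
    """Decompose one tree's class probabilities for a single sample x.

    Returns (bias, contributions): bias has shape (n_classes,),
    contributions has shape (n_features, n_classes).
    """
    t = tree.tree_
    # Class distribution at every node, normalized so each row sums to 1
    # (works whether tree_.value stores counts or fractions).
    values = t.value[:, 0, :]
    values = values / values.sum(axis=1, keepdims=True)

    # Node ids visited by x, from the root down to the leaf.
    path = tree.decision_path(x.reshape(1, -1)).indices

    contributions = np.zeros((x.shape[0], values.shape[1]))
    for parent, child in zip(path[:-1], path[1:]):
        # Credit the change in class distribution to the feature
        # that was tested at the parent node.
        contributions[t.feature[parent]] += values[child] - values[parent]
    return values[0], contributions


def forest_contributions(forest, x):
    """Average the per-tree decomposition over all trees in the forest."""
    parts = [tree_contributions(est, x) for est in forest.estimators_]
    bias = np.mean([b for b, _ in parts], axis=0)
    contributions = np.mean([c for _, c in parts], axis=0)
    return bias, contributions


# Stand-in data (an assumption; the question does not show its data).
X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

x = X[0]
bias, contributions = forest_contributions(forest, x)
proba = forest.predict_proba(x.reshape(1, -1))[0]

# Sanity check: bias plus the summed contributions reproduces predict_proba.
assert np.allclose(bias + contributions.sum(axis=0), proba)
```

Here contributions[j, k] is how much feature j moved the predicted probability of class k for this one sample, which is the per-data-point counterpart of forest.feature_importances_. Contributions can be negative, and they depend on the sample, unlike the global importances.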
