eli5 permuter.feature_importances_ returning all zeros
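I'm trying to get permutation importances for a RandomForestClassifier on a small sample of data. I can get the simple feature importances, but the permutation importances come back as all zeros.
Here's the code:
Input 1: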
from sklearn.ensemble import RandomForestClassifier

# encoder is defined earlier (not shown in the post)
X_train_encoded = encoder.fit_transform(X_train1)
X_val_encoded = encoder.transform(X_val1)
model = RandomForestClassifier(n_estimators=300, random_state=25,
                               n_jobs=-1, max_depth=2)
model.fit(X_train_encoded, y_train1)
Output 1:
RandomForestClassifier(bootstrap=True, ccp_alpha=0.0, class_weight=None,
                       criterion='gini', max_depth=2, max_features='auto',
                       max_leaf_nodes=None, max_samples=None,
                       min_impurity_decrease=0.0, min_impurity_split=None,
                       min_samples_leaf=1, min_samples_split=2,
                       min_weight_fraction_leaf=0.0, n_estimators=300,
                       n_jobs=-1, oob_score=False, random_state=25, verbose=0,
                       warm_start=False)
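Input 2:
from eli5.sklearn import PermutationImportance

# model is already fitted, so the default cv='prefit' is used
permuter = PermutationImportance(
    model,
    scoring='accuracy',
    n_iter=3,
    random_state=25
)
permuter.fit(X_val_encoded, y_val1)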
Output 2:
PermutationImportance(cv='prefit',
                      estimator=RandomForestClassifier(bootstrap=True,
                                                       ccp_alpha=0.0,
                                                       class_weight=None,
                                                       criterion='gini',
                                                       max_depth=2,
                                                       max_features='auto',
                                                       max_leaf_nodes=None,
                                                       max_samples=None,
                                                       min_impurity_decrease=0.0,
                                                       min_impurity_split=None,
                                                       min_samples_leaf=1,
                                                       min_samples_split=2,
                                                       min_weight_fraction_leaf=0.0,
                                                       n_estimators=300,
                                                       n_jobs=-1,
                                                       oob_score=False,
                                                       random_state=25,
                                                       verbose=0,
                                                       warm_start=False),
                      n_iter=3, random_state=25, refit=True,
                      scoring='accuracy')
(Problem) Input 3:
feature_names = X_val_encoded.columns.tolist()
pd.Series(permuter.feature_importances_, feature_names).sort_values()
(Problem) Output 3:
Player 0.0
POS 0.0
ATT 0.0
YDS 0.0
TDS 0.0
REC 0.0
YDS.1 0.0
TDS.1 0.0
FL 0.0
FPTS 0.0
Overall 0.0
pos_adp 0.0
dtype: float64
I expected to get values here, but I'm getting zeros instead. Am I doing something wrong, or is this a possible result?
In: permuter.feature_importances_
Out:array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
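It turned out the problem was with the data I was passing in, not with the code itself.
The data had fewer than 70 observations, so after I was able to add more observations to it (just under 400), I was able to get permutation importances as expected.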
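The post doesn't say why the small sample produced zeros, so what follows is an assumption rather than a confirmed diagnosis. Permutation importance is the drop in score after shuffling one column. If shuffling never changes any prediction (for example, a depth-2 forest fit on a few dozen rows that ends up predicting the same class for every row), every importance is exactly zero. Accuracy on n validation rows also only moves in steps of 1/n, so a handful of rows leaves little room for a visible drop. Here is a minimal pure-Python sketch of the idea. It is not eli5's implementation, and the toy data and the two stand-in "models" are hypothetical:

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which model(row) equals the true label."""
    hits = sum(1 for row, y in zip(rows, labels) if model(row) == y)
    return hits / len(rows)


def permutation_importance(model, rows, labels, n_iter=3, seed=25):
    """Mean drop in accuracy after shuffling each column, one at a time."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_iter):
            # Shuffle only this column; all other columns stay in place.
            shuffled = [row[col] for row in rows]
            rng.shuffle(shuffled)
            permuted = [row[:col] + [v] + row[col + 1:]
                        for row, v in zip(rows, shuffled)]
            drops.append(baseline - accuracy(model, permuted, labels))
        importances.append(sum(drops) / n_iter)
    return importances


# Hypothetical toy validation set: two features, binary label.
rows = [[float(i), float(i % 3)] for i in range(20)]
labels = [1 if row[0] >= 10 else 0 for row in rows]


def constant_model(row):
    # Stand-in for an underfit model that predicts one class everywhere.
    return 0


def threshold_model(row):
    # Stand-in for a model that actually uses feature 0 (and ignores feature 1).
    return 1 if row[0] >= 10 else 0


constant_importances = permutation_importance(constant_model, rows, labels)
threshold_importances = permutation_importance(threshold_model, rows, labels)

print(constant_importances)   # [0.0, 0.0]: predictions never change
print(threshold_importances)  # feature 0 is positive, feature 1 stays 0.0
```

To check whether the same thing is happening with the real model, one could compare `model.predict(X_val_encoded)` before and after shuffling a column: if the predictions are identical, a zero importance for that column is the expected result, not a bug.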