Question about Permutation Importance on LSTM Keras

Question

from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense
from keras.wrappers.scikit_learn import KerasClassifier, KerasRegressor
import eli5
from eli5.sklearn import PermutationImportance

# Stacked LSTM; each sample has shape (timesteps, 421 features)
model = Sequential()
model.add(LSTM(units=30, return_sequences=True, input_shape=(X.shape[1], 421)))
model.add(Dropout(rate=0.2))
model.add(LSTM(units=30, return_sequences=True))
model.add(LSTM(units=30))
model.add(Dense(units=1, activation='relu'))

# This call raises the ValueError shown below
perm = PermutationImportance(model, scoring='accuracy', random_state=1).fit(X, y, epochs=500, batch_size=8)
eli5.show_weights(perm, feature_names=X.columns.tolist())

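I am running an LSTM just to look at the feature importance of a dataset with more than 400 features. I am using the Keras scikit-learn wrapper so that I can use eli5's PermutationImportance function, but the code returns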

ValueError: Found array with dim 3. Estimator expected <= 2.

If I use model.fit(), the code runs smoothly, but I can't debug the permutation importance error. Does anyone know what is wrong?

Answer 1

eli5's scikit-learn implementation of permutation importance can only process 2D arrays, while Keras' LSTM layers require 3D arrays. This error is a known issue, but there does not seem to be a solution yet.
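Outside eli5, the permutation idea itself does not require a 2D array. Below is a minimal NumPy sketch of it for 3D input, assuming X has shape (samples, timesteps, features) and that a higher score is better. The names `permutation_importance_3d`, `predict_fn`, `score_fn`, `toy_predict`, and `neg_mse` are hypothetical and not part of eli5 or Keras; with a real model, `predict_fn` would be something like `model.predict`.

```python
import numpy as np


def permutation_importance_3d(predict_fn, X, y, score_fn, n_repeats=5, seed=0):
    """Mean score drop when one feature is shuffled across samples.

    X has shape (samples, timesteps, features). predict_fn and score_fn are
    caller-supplied callables (hypothetical names); score_fn(y_true, y_pred)
    must return a value where higher is better.
    """
    rng = np.random.default_rng(seed)
    baseline = score_fn(y, predict_fn(X))
    importances = np.zeros(X.shape[2])
    for j in range(X.shape[2]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffle which sample feature j comes from, keeping each
            # sample's time series for that feature intact.
            idx = rng.permutation(X.shape[0])
            X_perm[:, :, j] = X[idx, :, j]
            drops.append(baseline - score_fn(y, predict_fn(X_perm)))
        importances[j] = np.mean(drops)
    return importances


# Toy check with a stand-in "model" that only looks at feature 0.
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(200, 10, 3))
y_demo = X_demo[:, :, 0].sum(axis=1)


def toy_predict(X):
    return X[:, :, 0].sum(axis=1)


def neg_mse(y_true, y_pred):
    return -np.mean((y_true - y_pred) ** 2)


demo_importances = permutation_importance_3d(toy_predict, X_demo, y_demo, neg_mse)
print(demo_importances)
```

In the toy check only feature 0 should get a positive importance, since the stand-in model ignores the other two. With 400+ features and a real LSTM, this loop calls the model once per feature per repeat, so a small `n_repeats` and a validation subset are probably sensible.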

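I know this doesn't really answer your question of getting eli5 to work with an LSTM (because it currently can't), but I ran into the same problem and used another library, SHAP, to get the feature importance of my LSTM model. Here is some of my code to help you get started: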

import shap

# Background data for the explainer; X_train is a 3D numpy.ndarray
DE = shap.DeepExplainer(model, X_train)
# X_validate_np is a 3D numpy.ndarray
shap_values = DE.shap_values(X_validate_np, check_additivity=False)

shap.initjs()
shap.summary_plot(
    shap_values[0],
    X_validate,
    feature_names=list_of_your_columns_here,
    max_display=50,
    plot_type='bar')

Here is an example of the plot you can get:

[Figure: SHAP summary bar plot]
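If a plain ranking is enough, the per-feature importance can also be computed by hand. The sketch below assumes the attribution array has shape (samples, timesteps, features); the actual shape of `shap_values[0]` depends on the SHAP version and the model, so check it first. `rank_features_from_3d_attributions` is a hypothetical helper, not part of SHAP, and the demo array is a stand-in rather than real SHAP output.

```python
import numpy as np


def rank_features_from_3d_attributions(attributions, feature_names):
    """Rank features by mean |attribution| over samples and timesteps."""
    attributions = np.asarray(attributions)
    if attributions.ndim != 3:
        raise ValueError("expected shape (samples, timesteps, features)")
    # Collapse the sample and time axes, leaving one value per feature.
    mean_abs = np.abs(attributions).mean(axis=(0, 1))
    order = np.argsort(mean_abs)[::-1]  # largest first
    return [(feature_names[i], float(mean_abs[i])) for i in order]


# Stand-in attributions: 4 samples, 5 timesteps, 3 features.
demo_attr = np.zeros((4, 5, 3))
demo_attr[:, :, 1] = 2.0
demo_attr[:, :, 2] = -0.5
ranking = rank_features_from_3d_attributions(demo_attr, ["f0", "f1", "f2"])
print(ranking)
```

Averaging absolute values over time treats every timestep equally; if recent timesteps matter more for your problem, a weighted mean over the time axis may be a better fit.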

Hope this helps.

scikit-learn

lstm

keras

eli5