Plot SHAP values in waterfall and beeswarm plots
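I am using a random forest for binary classification, and I am trying to use SHAP to explain the model's predictions. However, I keep getting the errors below. I am following the tutorial here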
import shap
explainer = shap.Explainer(rf_boruta) #pass my model
shap_values = explainer(ord_test_t) #pass my test dataset
sample_idx = 15
shap_vals = explainer.shap_values(ord_test_t.iloc[sample_idx:sample_idx+1])
print("Base Value : ", explainer.expected_value)
print()
print("Shap Values for Sample %d : "%sample_idx, shap_vals)
print("\n")
print("Prediction From Model : ", rf_boruta.predict(ord_test_t.iloc[15:16]))
print("Prediction From Adding SHAP Values to Base Value : ", explainer.expected_value + shap_vals.sum())
I get the error shown below
> 8 print("\n")
> 9 print("Prediction From Model : ", rf_boruta.predict(ord_test_t.iloc[sample_idx:sample_idx+1]))
> ---> 10 print("Prediction From Adding SHAP Values to Base Value : ", explainer.expected_value + shap_vals.sum())
>
> AttributeError: 'list' object has no attribute 'sum'
When I tried another tutorial, here, I got a different error
explainer = shap.TreeExplainer(rf_boruta,ord_test_t)
shap_values = explainer.shap_values(ord_test_t)
sample_ind = 0
shap.waterfall_plot(explainer.expected_value, shap_values[sample_ind],ord_test_t.iloc[sample_ind])
TypeError: waterfall() got multiple values for argument 'max_display'
Later, when I changed it to the default value, I got another error, shown below
> ---> 46 base_values = shap_values.base_values
> 47
> 48 features = shap_values.data
>
> AttributeError: 'numpy.ndarray' object has no attribute 'base_values'
Update: I tried another piece of code
row_to_show = 5
data_for_prediction = ord_test_t.iloc[row_to_show] # use 1 row of data here. Could use multiple rows if desired
data_for_prediction_array = data_for_prediction.values.reshape(1, -1)
rf_boruta.predict_proba(data_for_prediction_array)
explainer = shap.TreeExplainer(rf_boruta)
# Calculate Shap values
shap_values = explainer.shap_values(data_for_prediction)
shap.waterfall_plot(explainer.expected_value,shap_values,data_for_prediction)
The traceback tells you what the problem is:
10 print("Prediction From Adding SHAP Values to Base Value : ", explainer.expected_value + shap_vals.sum())
AttributeError: 'list' object has no attribute 'sum'
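So instead of calling shap_vals.sum() on a list, which has no such method, compute the sum in a way that a list supports, for example with the built-in sum function: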
print("Prediction From Adding SHAP Values to Base Value : ", explainer.expected_value + sum(shap_vals))
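One caveat worth sketching: for a binary classifier, the list returned by explainer.shap_values often holds one array per class, so sum(shap_vals) adds the two class arrays together element-wise rather than producing a single number. The snippet below is a minimal sketch, not the library's own method. It assumes that per-class list layout (or, alternatively, a single 3-D array with the class on the last axis), and uses made-up numbers in place of real explainer output. The helper name class_shap_values and the fake_* variables are hypothetical.

```python
import numpy as np


def class_shap_values(shap_vals, expected_value, class_idx=1):
    """Return (SHAP matrix, base value) for a single class.

    Assumed layouts (check what your SHAP version actually returns):
      * list with one (n_samples, n_features) array per class
      * one array of shape (n_samples, n_features, n_classes)
      * one 2-D array for a single-output model
    """
    if isinstance(shap_vals, list):
        values = np.asarray(shap_vals[class_idx])
    else:
        values = np.asarray(shap_vals)
        if values.ndim == 3:
            values = values[:, :, class_idx]

    # expected_value may be a scalar or hold one entry per class
    base = np.ravel(expected_value)
    base_value = float(base[class_idx]) if base.size > 1 else float(base[0])
    return values, base_value


# Made-up numbers standing in for explainer.shap_values(one_row)
# and explainer.expected_value on a binary classifier.
fake_shap_vals = [
    np.array([[-0.10, 0.05, -0.15]]),  # class 0
    np.array([[0.10, -0.05, 0.15]]),   # class 1
]
fake_expected_value = np.array([0.6, 0.4])

values, base_value = class_shap_values(fake_shap_vals, fake_expected_value, class_idx=1)

# Base value plus the row-wise sum of SHAP values, one number per sample
reconstructed = base_value + values.sum(axis=1)
print("Reconstructed output for class 1:", reconstructed)
```

With a real random forest, the reconstructed number would most likely need to be compared against rf_boruta.predict_proba(...) for the chosen class rather than against the hard label from rf_boruta.predict(...), since the SHAP values for such a model usually decompose the predicted probability.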