Getting UnicodeDecodeError when using shap on xgboost
I am trying to use shap on an xgboost model, but I get this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 341: invalid start byte
Example:
model = XGBClassifier()
model.fit(X_train, y_train)
explainer = shap.TreeExplainer(model)
Package versions:
python == 3.6.9
xgboost==1.1.0
shap==0.35.0
What is the problem, and how can I fix it?
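This appears to be a bug. See: https://github.com/slundberg/shap/issues/1215.
The issue seems to have been fixed, but the fix may not have been released yet. In any case, I ran into the same problem and worked around it for now by installing xgboost v1.0.0.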
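Since the workaround depends on which versions are installed, it can help to confirm what the active environment has before switching. This is a minimal sketch, assuming Python 3.8 or newer for `importlib.metadata`; the helper name `installed_version` is only illustrative.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Print the versions of the two packages discussed in this thread.
    for name in ("xgboost", "shap"):
        print(name, installed_version(name))
```

On older interpreters such as the Python 3.6 and 3.7 mentioned in this thread, `pip show xgboost shap` gives the same information.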
I tried the following solution and it worked.
Package versions:
python == 3.7.7
xgboost==1.1.1
shap==0.35.0
This code worked fine for me:
import shap
from xgboost.sklearn import XGBClassifier
xgb = XGBClassifier(random_state=42)
mymodel = xgb.fit(X_train, y_train)
# This is the part that actually fixes the problem; do not skip it.
mybooster = mymodel.get_booster()
model_bytearray = mybooster.save_raw()[4:]
def myfun(self=None):
    return model_bytearray
mybooster.save_raw = myfun
# SHAP explainer initialization
shap_ex = shap.TreeExplainer(mybooster)
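The `[4:]` slice above drops the first four bytes unconditionally. My understanding from the linked issue is that newer xgboost versions prepend a `binf` marker to the raw buffer, which shap 0.35.0 does not expect; treat that as an assumption and check it against your own versions. Under that assumption, a slightly more defensive sketch strips the marker only when it is present. The names `strip_binf_header` and `patch_save_raw` are hypothetical helpers, not part of xgboost or shap.

```python
BINF_HEADER = b"binf"  # assumed marker; verify against your xgboost version


def strip_binf_header(raw):
    """Drop the leading 4-byte marker only when it is actually present."""
    raw = bytearray(raw)
    if raw.startswith(BINF_HEADER):
        return raw[len(BINF_HEADER):]
    return raw


def patch_save_raw(booster):
    """Replace booster.save_raw with a function returning the stripped buffer.

    `booster` is any object exposing a save_raw() method that can be called
    without arguments, such as the result of get_booster() above.
    """
    stripped = strip_binf_header(booster.save_raw())
    # Accept and ignore any arguments so callers with extra parameters still work.
    booster.save_raw = lambda *args, **kwargs: stripped
    return booster
```

With this in place, `shap.TreeExplainer(patch_save_raw(mymodel.get_booster()))` would replace the manual patch. Calling the helper twice is harmless, because the second call finds no marker left to strip.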
I had the same problem with xgboost 1.2.0 and shap 0.35.0.
Here is a complete example that I was able to run without problems:
import numpy as np
import xgboost as xgb
import shap
# data
np.random.seed(100)
X_train = np.random.random((100, 10))
y_train = np.random.randint(2, size=100)
# model
model = xgb.XGBClassifier(random_state=42)
fitted_model = model.fit(X_train, y_train)
# monkey patch
booster = fitted_model.get_booster()
model_bytearray = booster.save_raw()[4:]
booster.save_raw = lambda: model_bytearray
# shap explainer
explainer = shap.TreeExplainer(booster)
shap_values = explainer.shap_values(X_train)
shap.summary_plot(shap_values, X_train)
Output: the SHAP summary plot (image not included here).
!pip install shap==0.36.0
!pip install xgboost==1.3.3
Installing these versions worked for me.