How to get SHAP values for Huggingface Transformer Model Prediction [Zero-Shot Classification]?

Given a zero-shot classification task via Huggingface as follows:

from transformers import pipeline
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

example_text = "This is an example text about snowflakes in the summer"
labels = ["weather", "sports", "computer industry"]
        
output = classifier(example_text, labels, multi_label=True)
output 
{'sequence': 'This is an example text about snowflakes in the summer',
'labels': ['weather', 'sports'],
'scores': [0.9780895709991455, 0.021910419687628746]}

I am trying to extract SHAP values to generate text-based explanations for the prediction result, as shown here: SHAP for Transformers

I have already tried the following approach based on the URL above:

import shap
from transformers import AutoModelForSequenceClassification, AutoTokenizer, ZeroShotClassificationPipeline

model = AutoModelForSequenceClassification.from_pretrained('facebook/bart-large-mnli')
tokenizer = AutoTokenizer.from_pretrained('facebook/bart-large-mnli')

pipe = ZeroShotClassificationPipeline(model=model, tokenizer=tokenizer, return_all_scores=True)

def score_and_visualize(text):
    prediction = pipe([text])
    print(prediction[0])

    explainer = shap.Explainer(pipe)
    shap_values = explainer([text])

    shap.plots.text(shap_values)

score_and_visualize(example_text)

Any suggestions? Thanks in advance for your help!

As an alternative to the pipeline above, the following also works:

from transformers import AutoModelForSequenceClassification, AutoTokenizer, ZeroShotClassificationPipeline

model = AutoModelForSequenceClassification.from_pretrained('facebook/bart-large-mnli')
tokenizer = AutoTokenizer.from_pretrained('facebook/bart-large-mnli')

classifier = ZeroShotClassificationPipeline(model=model, tokenizer=tokenizer, return_all_scores=True)

example_text = "This is an example text about snowflakes in the summer"
labels = ["weather", "sports"]

output = classifier(example_text, labels)
output 
{'sequence': 'This is an example text about snowflakes in the summer',
'labels': ['weather', 'sports'],
'scores': [0.9780895709991455, 0.021910419687628746]}

The ZeroShotClassificationPipeline is currently not supported by shap, but you can use a workaround. The workaround is required because:

  1. The shap Explainer forwards only one parameter to the model (the pipeline in this case), but the ZeroShotClassificationPipeline requires two parameters: the text and the labels.
  2. The shap Explainer will access the config of your model and use its label2id and id2label attributes. These do not match the labels returned by the ZeroShotClassificationPipeline and will result in an error (see the sketch right below this list).
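
To make issue 2 concrete, here is a small check of what the config of an NLI checkpoint actually contains. The printed values are an assumption about this particular checkpoint, but the point is that they are NLI labels, not the candidate labels:

from transformers import AutoConfig

# The config of an NLI checkpoint carries the NLI labels,
# not the candidate labels that the pipeline returns.
config = AutoConfig.from_pretrained("facebook/bart-large-mnli")
print(config.id2label)  # e.g. {0: 'contradiction', 1: 'neutral', 2: 'entailment'}
print(config.label2id)  # e.g. {'contradiction': 0, 'neutral': 1, 'entailment': 2}
# shap reads these names for its output, while the pipeline returns the
# candidate labels ("weather", "sports", ...) - hence the mismatch.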

Below is a suggestion for one possible workaround. I recommend opening an issue with shap and requesting official support for huggingface's ZeroShotClassificationPipeline.

import shap
from transformers import AutoModelForSequenceClassification, AutoTokenizer, ZeroShotClassificationPipeline
from typing import Union, List

weights = "valhalla/distilbart-mnli-12-3"

model = AutoModelForSequenceClassification.from_pretrained(weights)
tokenizer = AutoTokenizer.from_pretrained(weights)

# Create your own pipeline that only requires the text parameter 
# for the __call__ method and provides a method to set the labels
class MyZeroShotClassificationPipeline(ZeroShotClassificationPipeline):
    # Overwrite the __call__ method so that it only takes the text;
    # the labels are taken from the attribute set via set_labels_workaround
    def __call__(self, *args):
        o = super().__call__(args[0], self.workaround_labels)[0]
        return [[{"label": x[0], "score": x[1]} for x in zip(o["labels"], o["scores"])]]

    def set_labels_workaround(self, labels: Union[str, List[str]]):
        self.workaround_labels = labels

example_text = "This is an example text about snowflakes in the summer"
labels = ["weather","sports"]

# In the following, we address issue 2 by adding the candidate labels
# to the model config (label2id: label -> index, id2label: index -> label)
model.config.label2id.update({v: k for k, v in enumerate(labels)})
model.config.id2label.update({k: v for k, v in enumerate(labels)})

pipe = MyZeroShotClassificationPipeline(model=model, tokenizer=tokenizer, return_all_scores=True)
pipe.set_labels_workaround(labels)

def score_and_visualize(text):
    prediction = pipe([text])
    print(prediction[0])

    explainer = shap.Explainer(pipe)
    shap_values = explainer([text])

    shap.plots.text(shap_values)


score_and_visualize(example_text)
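
As a quick sanity check, you can inspect what the patch actually did to the config. The values below are what the two dictionary updates produce, under the assumption that the checkpoint starts with the usual contradiction/neutral/entailment labels:

# After the update, label2id holds both the NLI labels and the candidate
# labels, while the first entries of id2label are overwritten by them.
print(model.config.label2id)
# e.g. {'contradiction': 0, 'neutral': 1, 'entailment': 2, 'weather': 0, 'sports': 1}
print(model.config.id2label)
# e.g. {0: 'weather', 1: 'sports', 2: 'entailment'}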

Output: [shap.plots.text visualization of the token-level contributions to each label's score]

This is a follow-up to the discussion with @cronoik, which may help others understand why the magic of patching label2id works.

The documentation of the ZeroShotClassificationPipeline states:

NLI-based zero-shot classification pipeline using a ModelForSequenceClassification trained on NLI (natural language inference) tasks.

Any combination of sequences and labels can be passed and each combination will be posed as a premise/hypothesis pair and passed to the pretrained model. Then, the logit for entailment is taken as the logit for the candidate label being valid. Any NLI model can be used, but the id of the entailment label must be included in the model config's ~transformers.PretrainedConfig.label2id.

This means (see the attached source code):

  • labels provided through the __call__ method are passed to the underlying trained model (via label2id) and will be tried out in premise/hypothesis sentence pairs
  • if you overwrite label2id manually, the entailment label should be added to label2id (you get a warning otherwise); no other labels need to be added

Once these conditions are fulfilled, the model returns, for each provided label, the sigmoid/softmax of the entailment logit of a classification like

"<cls> sequence to classify <sep> This example is {label} . <sep>"

as the entailment probability of that label.
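
A minimal sketch of this scoring, under the assumption that the checkpoint uses the usual contradiction/neutral/entailment labels; this is not the pipeline's exact implementation, just the idea behind it:

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

weights = "valhalla/distilbart-mnli-12-3"
model = AutoModelForSequenceClassification.from_pretrained(weights)
tokenizer = AutoTokenizer.from_pretrained(weights)

# One candidate label is scored by pairing the sequence (premise) with the
# filled-in hypothesis template and reading off the entailment logit.
premise = "This is an example text about snowflakes in the summer"
hypothesis = "This example is weather."

inputs = tokenizer(premise, hypothesis, return_tensors="pt")
logits = model(**inputs).logits[0]

# label2id must contain "entailment" - this is why the pipeline requires it.
entailment_id = model.config.label2id["entailment"]
contradiction_id = model.config.label2id["contradiction"]

# multi_label=True scores each label independently via a softmax over
# the (contradiction, entailment) logits only.
probs = torch.softmax(logits[[contradiction_id, entailment_id]], dim=0)
print(f"P(entailment) = {probs[1].item():.4f}")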

For this type of classifier pipeline, label2id is just used as a placeholder to hold the labels and pass them through to other parts of the pipeline.