Access the output of several layers of a pretrained DistilBERT model

Question

I am trying to access the output embeddings from several different layers of the pretrained "DistilBERT" model ("distilbert-base-uncased").

bert_output = model(input_ids, attention_mask=attention_mask)

bert_output seems to return only the last layer's embedding values for the input tokens.

Answer 1

If you want the output of all the hidden layers, you need to add the output_hidden_states=True kwarg to your config.

Your code will look something like this:

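from transformers import DistilBertModel, DistilBertConfig

# Ask the model to return the hidden states of every layer, not only the last one.
# The checkpoint name matches the one used in the question.
config = DistilBertConfig.from_pretrained('distilbert-base-uncased', output_hidden_states=True)
model = DistilBertModel.from_pretrained('distilbert-base-uncased', config=config)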

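The hidden states will then be available as bert_output[1]. DistilBertModel has no pooler output, so they sit at index 1 rather than the index 2 used by BertModel. In recent versions of transformers they can also be read as bert_output.hidden_states, a tuple holding the embedding output followed by the output of each transformer layer.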
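As a rough illustration of picking individual layers out of that tuple, here is a minimal sketch. To stay runnable without downloading weights it assumes a tiny, randomly initialized DistilBertModel; the config sizes and the token ids are made up for the example. It also assumes a transformers version in which the model returns an output object with a hidden_states attribute. With the real checkpoint you would load the model through from_pretrained as in the answer and tokenize actual text.

```python
import torch
from transformers import DistilBertConfig, DistilBertModel

# Assumption: a tiny random model (hypothetical sizes) so nothing is downloaded.
# For real embeddings, load the pretrained checkpoint with from_pretrained instead.
config = DistilBertConfig(
    vocab_size=100,
    dim=32,
    hidden_dim=64,
    n_layers=3,
    n_heads=4,
    output_hidden_states=True,
)
model = DistilBertModel(config)
model.eval()  # disable dropout so the outputs are deterministic

# Hypothetical batch: one sequence of five token ids, no padding.
input_ids = torch.tensor([[1, 2, 3, 4, 5]])
attention_mask = torch.ones_like(input_ids)

with torch.no_grad():
    bert_output = model(input_ids, attention_mask=attention_mask)

# Tuple of n_layers + 1 tensors: the embedding output first,
# then the output of each transformer layer in order.
hidden_states = bert_output.hidden_states

# Select the layers of interest, here the first and last transformer layers.
first_layer = hidden_states[1]
last_layer = hidden_states[-1]

# Each tensor has shape (batch_size, sequence_length, dim).
print(len(hidden_states), tuple(first_layer.shape))
```

Index 0 of the tuple is the embedding layer's output, so transformer layer k is hidden_states[k], and the final entry is the same tensor the model returns as its last hidden state.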


python

nlp

pytorch

bert-language-model

huggingface-transformers