Having 6 labels instead of 2 in Hugging Face BertForSequenceClassification

Question

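I was wondering whether the Hugging Face BertForSequenceClassification model can be extended to more than 2 labels. The docs say we can pass positional arguments, but "labels" does not seem to work. Does anybody have an idea?

Model assignment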

labels = th.tensor([0,0,0,0,0,0], dtype=th.long).unsqueeze(0)
print(labels.shape)
modelBERTClass = transformers.BertForSequenceClassification.from_pretrained(
    'bert-base-uncased', 
    labels=labels
    )

l = [module for module in modelBERTClass.modules()]
l

Console output

torch.Size([1, 6])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-122-fea9a36402a6> in <module>()
      3 modelBERTClass = transformers.BertForSequenceClassification.from_pretrained(
      4     'bert-base-uncased',
----> 5     labels=labels
      6     )
      7 

/usr/local/lib/python3.6/dist-packages/transformers/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    653 
    654         # Instantiate model.
--> 655         model = cls(config, *model_args, **model_kwargs)
    656 
    657         if state_dict is None and not from_tf:

TypeError: __init__() got an unexpected keyword argument 'labels'

Answer 1

When you load a model with .from_pretrained, it uses the default values of that checkpoint's configuration. For bert-base-uncased, config.num_labels is 2, so the model supports only two different labels:

from transformers import BertForSequenceClassification
model = BertForSequenceClassification.from_pretrained('bert-base-uncased')
model.parameters

Output:

...
  (classifier): Linear(in_features=768, out_features=2, bias=True)
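As a quick check of where the 2 comes from, the sketch below inspects a default BertConfig without downloading anything. It assumes that bert-base-uncased does not override num_labels in its own config file, so the library default applies; the six-label config at the end is purely illustrative.

```python
from transformers import BertConfig

# A config built from library defaults; no checkpoint download is needed.
default_config = BertConfig()
print(default_config.num_labels)  # size of the classification head
print(default_config.id2label)    # auto-generated label names

# Illustrative only: the same config class asked for six labels.
six_label_config = BertConfig(num_labels=6)
print(six_label_config.num_labels)
print(len(six_label_config.id2label))
```

The classifier layer is sized from num_labels when the model is constructed, which is why the value has to be set on the config (or passed at load time) rather than changed afterwards.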

You can easily change this value by modifying the BertConfig:

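from transformers import BertForSequenceClassification, BertConfig

# Load the checkpoint's configuration and widen the classification head to 6 labels.
config = BertConfig.from_pretrained('bert-base-uncased')
config.num_labels = 6
# Pass the config to from_pretrained so the pretrained encoder weights are kept;
# BertForSequenceClassification(config) alone would initialize every weight randomly.
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', config=config)
model.parameters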

Output:

...
 (classifier): Linear(in_features=768, out_features=6, bias=True)
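The traceback in the question shows from_pretrained forwarding unrecognized keyword arguments to the model's __init__, which is why labels was rejected there: labels is an argument of the forward call, not of model construction. Below is a minimal sketch of that split. To run without a download it uses a deliberately tiny, randomly initialized config whose sizes are made up for illustration; with real weights the analogous load would be from_pretrained('bert-base-uncased', num_labels=6).

```python
import torch
from transformers import BertConfig, BertForSequenceClassification

# Tiny illustrative config (NOT bert-base-uncased's sizes), randomly initialized.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    num_labels=6,  # the number of classes is fixed at construction time
)
model = BertForSequenceClassification(config)
model.eval()

# Dummy batch: 4 sequences of 8 token ids each.
input_ids = torch.randint(0, config.vocab_size, (4, 8))
# One class index per example, each in [0, num_labels - 1]; shape is (batch_size,).
labels = torch.tensor([0, 5, 2, 3])

with torch.no_grad():
    outputs = model(input_ids=input_ids, labels=labels)

# Index access works for both tuple and ModelOutput return types:
# when labels are given, the loss comes first, then the logits.
loss, logits = outputs[0], outputs[1]
print(model.classifier.out_features)
print(logits.shape)
```

Note that the labels tensor holds one integer per example, unlike the [1, 6] tensor in the question, which would be read as six class indices for a batch of one.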

Tags: python, transformer, bert-language-model, huggingface-transformers
