Change the last layer of a pretrained Hugging Face model

I want to fine-tune a transformer model, but I get an unknown error when trying to train it. I couldn't change "num_labels" while loading the model, so I tried to change it manually:

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification, Trainer

model_name = "mrm8488/flaubert-small-finetuned-movie-review-sentiment-analysis"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name).to('cuda')

num_labels = 3
# swap only the final projection layer; model.config.num_labels still says 2
model.sequence_summary.summary = torch.nn.Linear(
    in_features=model.sequence_summary.summary.in_features,
    out_features=num_labels,
    bias=True,
)

# training_args, tokenized_train, tokenized_test and compute_metrics
# are defined earlier in the notebook
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_train['train'],
    eval_dataset=tokenized_test['train'],
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
    #data_collator=data_collator,
)

trainer.train()

Error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-93-8139f38c5ec6> in <module>()
     20 )
     21 
---> 22 trainer.train()

7 frames
/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction, label_smoothing)
   2844     if size_average is not None or reduce is not None:
   2845         reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 2846     return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
   2847 
   2848 

ValueError: Expected input batch_size (24) to match target batch_size (16).
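
The mismatch comes from the fact that replacing the layer does not update model.config.num_labels: the loss computation still reshapes the logits with the stale value 2. Here is a minimal sketch of the arithmetic, mirroring how Hugging Face sequence-classification heads call cross-entropy (the batch size 16 and the old label count 2 are read off the traceback):

import torch

logits = torch.randn(16, 3)          # new head: batch of 16, 3 classes
labels = torch.randint(0, 3, (16,))  # 16 targets

# the model still reshapes with the stale config value num_labels=2,
# so 16*3 = 48 logit values are viewed as 24 rows of 2
flat_logits = logits.view(-1, 2)     # shape (24, 2)

# torch.nn.functional.cross_entropy(flat_logits, labels)
# -> ValueError: Expected input batch_size (24) to match target batch_size (16)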

So there is a solution: just add ignore_mismatched_sizes=True when loading the model:

model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3, ignore_mismatched_sizes=True).to('cuda')
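
This reloads the model with a freshly initialized 3-label head (the mismatched pretrained weights are discarded) and, crucially, keeps config.num_labels in sync with the new layer. If you still want to do the swap by hand instead, the stale label count has to be updated too. A sketch under that assumption, reusing the attribute names from the question:

num_labels = 3
model.sequence_summary.summary = torch.nn.Linear(
    in_features=model.sequence_summary.summary.in_features,
    out_features=num_labels,
    bias=True,
).to('cuda')

# keep the loss computation in sync with the new head
model.num_labels = num_labels
model.config.num_labels = num_labels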

Here is another example you can try:

import torch
import torch.nn as nn
from transformers import AutoModel

class CustomHFClassifier(nn.Module):
    def __init__(self, num_class):
        super().__init__()
        # pretrained encoder without any task head
        self.model = AutoModel.from_pretrained('bert-base-cased')
        self.dropout = nn.Dropout(0.25)
        self.classifier = nn.Linear(768, num_class)  # 768 = hidden size of bert-base

    def forward(self, input_ids, attention_mask, token_type_ids):
        # index 1 is the pooled [CLS] representation
        embeddings = self.model(
            input_ids, attention_mask=attention_mask, token_type_ids=token_type_ids
        )[1]
        embeddings = self.dropout(embeddings)
        embeddings = self.classifier(embeddings)
        return embeddings

model = CustomHFClassifier(10)
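
A quick usage sketch (the tokenizer call and the example sentence are only assumptions to show the expected shapes):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('bert-base-cased')
batch = tokenizer(["The movie was surprisingly good."], return_tensors='pt')

logits = model(batch['input_ids'], batch['attention_mask'], batch['token_type_ids'])
print(logits.shape)  # torch.Size([1, 10])

Note that a plain nn.Module like this does not return the loss/output objects Trainer expects, so you would train it with your own PyTorch loop or override Trainer.compute_loss.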