Change last layer on pretrained huggingface model
I want to fine-tune a transformer model, but I get an unknown error when I try to train it. I am not able to change "num_labels" when loading the model, so I tried to change it manually:
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification, Trainer

model_name = "mrm8488/flaubert-small-finetuned-movie-review-sentiment-analysis"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name).to('cuda')

num_labels = 3
# swap the FlauBERT classification head for one with 3 output classes
model.sequence_summary.summary = torch.nn.Linear(
    in_features=model.sequence_summary.summary.in_features,
    out_features=num_labels,
    bias=True,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_train['train'],
    eval_dataset=tokenized_test['train'],
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
    # data_collator=data_collator,
)
trainer.train()
Error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-93-8139f38c5ec6> in <module>()
20 )
21
---> 22 trainer.train()
7 frames
/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction, label_smoothing)
2844 if size_average is not None or reduce is not None:
2845 reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 2846 return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
2847
2848
ValueError: Expected input batch_size (24) to match target batch_size (16).
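The sizes in the error point to the cause: replacing the layer does not update model.config.num_labels, so the model's built-in loss still reshapes the logits with the old number of classes (2). With the new 3-way head, the 16 * 3 logit values get viewed as 24 rows of 2 against 16 targets. A minimal sketch reproducing the same ValueError (the per-device batch size of 16 is an assumption implied by the 24-vs-16 arithmetic):

import torch
import torch.nn.functional as F

logits = torch.randn(16, 3)          # new 3-way head, batch of 16
labels = torch.randint(0, 3, (16,))  # 16 targets
# the model still reshapes with the old config.num_labels == 2:
F.cross_entropy(logits.view(-1, 2), labels)
# ValueError: Expected input batch_size (24) to match target batch_size (16).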
So, the solution is to let transformers rebuild the head for you: just add ignore_mismatched_sizes=True (together with num_labels=3) when loading the model, like this:

model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3, ignore_mismatched_sizes=True).to('cuda')
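from_pretrained will log a warning that the mismatched head weights were newly initialized, which is expected. As a quick sanity check (using the same sequence_summary.summary attribute as in the code above), you can confirm the config and the head now agree on three classes:

print(model.config.num_labels)                      # 3
print(model.sequence_summary.summary.out_features)  # 3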
Here is another example you can try:
import torch
import torch.nn as nn
from transformers import AutoModel

class CustomHFClassifier(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        self.model = AutoModel.from_pretrained('bert-base-cased')
        self.dropout = nn.Dropout(0.25)
        # 768 is the hidden size of bert-base-cased
        self.classifier = nn.Linear(768, num_classes)

    def forward(self, input_ids, attention_mask, token_type_ids):
        # [1] is the pooled [CLS] representation returned by BERT
        embeddings = self.model(
            input_ids, attention_mask=attention_mask, token_type_ids=token_type_ids
        )[1]
        embeddings = self.dropout(embeddings)
        return self.classifier(embeddings)

model = CustomHFClassifier(10)
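A minimal sketch of how you could run this custom model, assuming the matching bert-base-cased tokenizer. Note that this class returns raw logits rather than a loss, so Trainer will not work with it out of the box; you would compute the loss yourself (e.g. with nn.CrossEntropyLoss) in a manual training loop or a Trainer subclass:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('bert-base-cased')
batch = tokenizer(["A sample movie review"], padding=True, truncation=True, return_tensors='pt')

logits = model(batch['input_ids'],
               attention_mask=batch['attention_mask'],
               token_type_ids=batch['token_type_ids'])
print(logits.shape)  # torch.Size([1, 10])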