Metrics mismatch between BertForSequenceClassification Class and my custom Bert Classification
I implemented my custom Bert binary classification model class by adding a classifier layer on top of the Bert model (attached below). However, the accuracy/metrics differ significantly when I train with the official BertForSequenceClassification model, which makes me wonder whether I am missing something in my class.
I have a couple of doubts:
When the official BertForSequenceClassification is loaded with from_pretrained, are the classifier weights also initialized from the pretrained model, or are they randomly initialized? Because in my custom class they are randomly initialized.
import torch.nn as nn
from transformers import AutoConfig, AutoModel

class MyCustomBertClassification(nn.Module):
    def __init__(self, encoder='bert-base-uncased',
                 num_labels=2,
                 hidden_dropout_prob=0.1):
        super(MyCustomBertClassification, self).__init__()
        self.config = AutoConfig.from_pretrained(encoder)
        self.encoder = AutoModel.from_config(self.config)
        self.dropout = nn.Dropout(hidden_dropout_prob)
        self.classifier = nn.Linear(self.config.hidden_size, num_labels)

    def forward(self, input_sent):
        outputs = self.encoder(input_ids=input_sent['input_ids'],
                               attention_mask=input_sent['attention_mask'],
                               token_type_ids=input_sent['token_type_ids'],
                               return_dict=True)
        # outputs[1] is the pooled [CLS] representation (pooler_output)
        pooled_output = self.dropout(outputs[1])
        # for both tasks
        logits = self.classifier(pooled_output)
        return logits
Every model tells you via a warning message which layers are randomly initialized when you use the from_pretrained method:
from transformers import BertForSequenceClassification
b = BertForSequenceClassification.from_pretrained('bert-base-uncased')
Output:
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.weight', 'classifier.bias']
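You can verify this directly with a minimal sketch (using nothing beyond the checkpoint already loaded above): loading the checkpoint twice yields identical encoder weights, but the freshly initialized classifier heads differ between loads:

import torch
from transformers import BertForSequenceClassification

# Load the same checkpoint twice: the encoder weights come from the
# checkpoint, but each classifier head is randomly re-initialized per load.
m1 = BertForSequenceClassification.from_pretrained('bert-base-uncased')
m2 = BertForSequenceClassification.from_pretrained('bert-base-uncased')
print(torch.equal(m1.bert.embeddings.word_embeddings.weight,
                  m2.bert.embeddings.word_embeddings.weight))  # True
print(torch.equal(m1.classifier.weight, m2.classifier.weight))  # False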
The difference between your implementation and BertForSequenceClassification is that you do not use any pretrained weights at all. The from_config method does not load the pretrained weights from the state_dict:
import torch
from transformers import AutoModelForSequenceClassification, AutoConfig
b2 = AutoModelForSequenceClassification.from_config(AutoConfig.from_pretrained('bert-base-uncased'))
b3 = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased')
print("Does from_config provides pretrained weights: {}".format(torch.equal(b.bert.embeddings.word_embeddings.weight, b2.base_model.embeddings.word_embeddings.weight)))
print("Does from_pretrained provides pretrained weights: {}".format(torch.equal(b.bert.embeddings.word_embeddings.weight, b3.base_model.embeddings.word_embeddings.weight)))
Output:
Does from_config provide pretrained weights: False
Does from_pretrained provide pretrained weights: True
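The same holds for the bare AutoModel encoder that your custom class builds; a minimal sketch under the same assumptions as the demo above:

from transformers import AutoModel, AutoConfig
import torch

# from_config builds the architecture with random weights;
# from_pretrained loads the checkpoint weights.
enc_cfg = AutoModel.from_config(AutoConfig.from_pretrained('bert-base-uncased'))
enc_pre = AutoModel.from_pretrained('bert-base-uncased')
print(torch.equal(enc_pre.embeddings.word_embeddings.weight,
                  enc_cfg.embeddings.word_embeddings.weight))  # False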
Therefore you probably want to change your class to:
class MyCustomBertClassification(nn.Module):
    def __init__(self, encoder='bert-base-uncased',
                 num_labels=2,
                 hidden_dropout_prob=0.1):
        super(MyCustomBertClassification, self).__init__()
        self.config = AutoConfig.from_pretrained(encoder)
        # from_pretrained loads the checkpoint weights into the encoder
        self.encoder = AutoModel.from_pretrained(encoder)
        self.dropout = nn.Dropout(hidden_dropout_prob)
        self.classifier = nn.Linear(self.config.hidden_size, num_labels)

    def forward(self, input_sent):
        outputs = self.encoder(input_ids=input_sent['input_ids'],
                               attention_mask=input_sent['attention_mask'],
                               token_type_ids=input_sent['token_type_ids'],
                               return_dict=True)
        pooled_output = self.dropout(outputs[1])
        # for both tasks
        logits = self.classifier(pooled_output)
        return logits
myB = MyCustomBertClassification()
print(torch.equal(b.bert.embeddings.word_embeddings.weight, myB.encoder.embeddings.word_embeddings.weight))
Output:
True
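As an aside, if the custom class only exists to set the number of labels and the dropout, the stock class can be configured the same way, since from_pretrained forwards extra kwargs to the config (a standard transformers pattern, not something from the question):

from transformers import BertForSequenceClassification

# num_labels and hidden_dropout_prob are forwarded to the config; the
# encoder is pretrained, the classifier head is randomly initialized.
model = BertForSequenceClassification.from_pretrained('bert-base-uncased',
                                                      num_labels=2,
                                                      hidden_dropout_prob=0.1)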