Copy one layer's weights from one Huggingface BERT model to another
I have a pre-trained model, which I load like this:
from transformers import BertForSequenceClassification, AdamW, BertConfig, BertModel
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased",          # Use the 12-layer BERT model, with an uncased vocab.
    num_labels = 2,               # The number of output labels -- 2 for binary classification.
                                  # You can increase this for multi-class tasks.
    output_attentions = False,    # Whether the model returns attention weights.
    output_hidden_states = False, # Whether the model returns all hidden states.
)
I want to create a new model with the same architecture and randomly initialized weights, except for the embedding layer:
==== Embedding Layer ====
bert.embeddings.word_embeddings.weight (30522, 768)
bert.embeddings.position_embeddings.weight (512, 768)
bert.embeddings.token_type_embeddings.weight (2, 768)
bert.embeddings.LayerNorm.weight (768,)
bert.embeddings.LayerNorm.bias (768,)
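(For reference, a listing like the one above can be produced by iterating over the model's named parameters; a minimal sketch, assuming the model loaded above:)
for name, param in model.named_parameters():
    # Print each embedding parameter's name and shape, as in the listing above.
    if name.startswith('bert.embeddings'):
        print(name, tuple(param.shape))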
It looks like I can create a new model with the same architecture, but with all weights random, like this:
configuration = model.config
untrained_model = BertForSequenceClassification(configuration)
So how do I copy the embedding-layer weights from model over to the new untrained_model?
The weights and biases are just tensors, so you can simply copy them with copy_:
from transformers import BertForSequenceClassification, BertConfig

# Source model with pre-trained weights.
jetfire = BertForSequenceClassification.from_pretrained('bert-base-cased')

# Target model with the same architecture but randomly initialized weights.
config = BertConfig.from_pretrained('bert-base-cased')
optimus = BertForSequenceClassification(config)

# Fully qualified names of the embedding parameters to transfer.
parts = ['bert.embeddings.word_embeddings.weight',
         'bert.embeddings.position_embeddings.weight',
         'bert.embeddings.token_type_embeddings.weight',
         'bert.embeddings.LayerNorm.weight',
         'bert.embeddings.LayerNorm.bias']

def joltElectrify(jetfire, optimus, parts):
    target = dict(optimus.named_parameters())
    source = dict(jetfire.named_parameters())
    for part in parts:
        # In-place copy of the pre-trained tensor into the new model.
        target[part].data.copy_(source[part].data)

joltElectrify(jetfire, optimus, parts)
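As a quick sanity check (a minimal sketch, assuming the jetfire/optimus models and the parts list from above), you can verify that the copied tensors now match exactly:
import torch

source = dict(jetfire.named_parameters())
target = dict(optimus.named_parameters())
for part in parts:
    # torch.equal returns True only if shapes and all values match.
    assert torch.equal(source[part], target[part]), f"{part} was not copied"
print("All embedding tensors copied successfully")
Alternatively, since the two models share an architecture, you can copy the whole embeddings module in one call with optimus.bert.embeddings.load_state_dict(jetfire.bert.embeddings.state_dict()), which transfers the same five tensors at once.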