Use Quantization on HuggingFace Transformers models
I am learning about quantization and experimenting with the Part 1 notebook. I want to use this code on my own model. I assume I only need to change what is assigned to the model variable in section 1.2:
# load model
model = BertForSequenceClassification.from_pretrained(configs.output_dir)
model.to(configs.device)
My model comes from a different library: from transformers import pipeline. So .to() throws an AttributeError.
My model:
pip install transformers
from transformers import pipeline
unmasker = pipeline('fill-mask', model='bert-base-uncased')
model = unmasker("Hello I'm a [MASK] model.")
Output:
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForMaskedLM: ['cls.seq_relationship.bias', 'cls.seq_relationship.weight']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
How do I run the linked quantization code on my example model?
Please let me know if there is anything else in my post that I need to clarify.
The pipeline approach will not work for quantization, because we need the model object itself to be returned. However, you can use pipeline to test timings etc. of the original models.
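For example, a minimal sketch of timing the original (unquantized) model through the pipeline; the sentence and the single-call timing here are only illustrative:

import time
from transformers import pipeline

# Original, unquantized model wrapped in a fill-mask pipeline
unmasker = pipeline('fill-mask', model='bert-base-uncased')

# Rough latency check for a single inference call (illustrative only)
start = time.perf_counter()
unmasker("Hello I'm a [MASK] model.")
print(f"fill-mask inference took {time.perf_counter() - start:.3f} s")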
Quantization code:
token_logits contains the tensors of the quantized model. You can put a for-loop around this code and replace model_name with strings from a list; a sketch of such a loop, including the quantization step, follows the code block below.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

sequence = "Distilled models are smaller than the models they mimic. Using them instead of the large " \
           f"versions would help {tokenizer.mask_token} our carbon footprint."

# Tokenize and locate the position of the [MASK] token
inputs = tokenizer(sequence, return_tensors="pt")
mask_token_index = torch.where(inputs["input_ids"] == tokenizer.mask_token_id)[1]

# Logits over the vocabulary for every token position
token_logits = model(**inputs).logits
# <- can stop here
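The steps after this point are not reproduced from the linked notebook; the sketch below assumes dynamic quantization of the Linear layers (as in the PyTorch dynamic-quantization approach for BERT) and uses an illustrative list of model names for the for-loop mentioned above:

import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Illustrative model list; replace with the checkpoints you care about
for model_name in ["bert-base-uncased", "distilbert-base-uncased"]:
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForMaskedLM.from_pretrained(model_name)

    # Quantize all nn.Linear weights to int8; activations stay float (CPU inference)
    quantized_model = torch.quantization.quantize_dynamic(
        model, {torch.nn.Linear}, dtype=torch.qint8
    )

    sequence = ("Distilled models are smaller than the models they mimic. Using them "
                f"instead of the large versions would help {tokenizer.mask_token} our carbon footprint.")
    inputs = tokenizer(sequence, return_tensors="pt")
    mask_token_index = torch.where(inputs["input_ids"] == tokenizer.mask_token_id)[1]

    # Logits from the quantized model
    token_logits = quantized_model(**inputs).logits

    # Top prediction for the masked position
    top_token = token_logits[0, mask_token_index, :].argmax(dim=-1)
    print(model_name, tokenizer.decode(top_token))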