BertModel transformers outputs string instead of tensor
I am following this tutorial that codes a sentiment analysis classifier using BERT with the huggingface library, but I am seeing strange behavior. When I try the BERT model with a sample text, I get a string instead of the hidden state. This is the code I am using:
import transformers
from transformers import BertModel, BertTokenizer
print(transformers.__version__)
PRE_TRAINED_MODEL_NAME = 'bert-base-cased'
PATH_OF_CACHE = "/home/mwon/data-mwon/paperChega/src_classificador/data/hugingface"
tokenizer = BertTokenizer.from_pretrained(PRE_TRAINED_MODEL_NAME, cache_dir=PATH_OF_CACHE)
sample_txt = 'When was I last outside? I am stuck at home for 2 weeks.'
encoding_sample = tokenizer.encode_plus(
    sample_txt,
    max_length=32,
    add_special_tokens=True,  # Add '[CLS]' and '[SEP]'
    return_token_type_ids=False,
    padding=True,
    truncation=True,
    return_attention_mask=True,
    return_tensors='pt',  # Return PyTorch tensors
)
bert_model = BertModel.from_pretrained(PRE_TRAINED_MODEL_NAME, cache_dir=PATH_OF_CACHE)
last_hidden_state, pooled_output = bert_model(
    encoding_sample['input_ids'],
    encoding_sample['attention_mask']
)
print([last_hidden_state,pooled_output])
Output:
4.0.0
['last_hidden_state', 'pooler_output']
I ran into the same problem while learning how to implement BERT. I noticed that using
last_hidden_state, pooled_output = bert_model(encoding_sample['input_ids'], encoding_sample['attention_mask'])
is the problem. Use:
outputs = bert_model(encoding_sample['input_ids'], encoding_sample['attention_mask'])
and extract the last_hidden_state with
outputs[0]
You can refer to the documentation here, which tells you what BertModel returns.
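As a minimal sketch of that fix (reusing bert_model and encoding_sample from the question), the positional indices correspond to the named fields of the output object:
outputs = bert_model(
    encoding_sample['input_ids'],
    encoding_sample['attention_mask']
)
# Index access and attribute access return the same tensors.
last_hidden_state = outputs[0]  # same as outputs.last_hidden_state
pooled_output = outputs[1]      # same as outputs.pooler_output
print(last_hidden_state.shape)  # torch.Size([1, seq_len, 768]) for bert-base-cased
print(pooled_output.shape)      # torch.Size([1, 768])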
While the answer above provides a solution to the problem, it does not explain it. Since one of the 3.X releases of the transformers library, the models no longer return tuples but specific output objects:
o = bert_model(
    encoding_sample['input_ids'],
    encoding_sample['attention_mask']
)
print(type(o))
print(o.keys())
Output:
<class 'transformers.modeling_outputs.BaseModelOutputWithPoolingAndCrossAttentions'>
odict_keys(['last_hidden_state', 'pooler_output'])
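This also explains the symptom in the question: these output objects are dict-like (ModelOutput subclasses OrderedDict), so tuple-unpacking one iterates over its keys. That is why the two unpacked variables held the strings 'last_hidden_state' and 'pooler_output' instead of tensors. A minimal sketch of the same effect with a plain dict:
# Unpacking a dict-like object yields its keys, not its values.
d = {'last_hidden_state': 'tensor A', 'pooler_output': 'tensor B'}
a, b = d
print([a, b])  # ['last_hidden_state', 'pooler_output'], just like the question's output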
You can get the previous behavior back and receive a tuple by adding return_dict=False:
o = bert_model(
    encoding_sample['input_ids'],
    encoding_sample['attention_mask'],
    return_dict=False
)
print(type(o))
Output:
<class 'tuple'>
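With return_dict=False, the tuple-unpacking from the question works again as the tutorial intended; a minimal sketch reusing the question's variables:
last_hidden_state, pooled_output = bert_model(
    encoding_sample['input_ids'],
    encoding_sample['attention_mask'],
    return_dict=False
)
print(type(last_hidden_state))  # <class 'torch.Tensor'>
print(type(pooled_output))      # <class 'torch.Tensor'>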
I do not recommend doing that, because with the output object, selecting a specific part of the output is explicit and does not require consulting the documentation, as the following example shows:
o = bert_model(encoding_sample['input_ids'], encoding_sample['attention_mask'], return_dict=False, output_attentions=True, output_hidden_states=True)
print('I am a tuple with {} elements. You do not know what each element represents without checking the documentation'.format(len(o)))
o = bert_model(encoding_sample['input_ids'], encoding_sample['attention_mask'], output_attentions=True, output_hidden_states=True)
print('I am a cool object and you can access my elements with o.last_hidden_state, o["last_hidden_state"] or even o[0]. My keys are: {}'.format(o.keys()))
Output:
I am a tuple with 4 elements. You do not know what each element represents without checking the documentation
I am a cool object and you can access my elements with o.last_hidden_state, o["last_hidden_state"] or even o[0]. My keys are: odict_keys(['last_hidden_state', 'pooler_output', 'hidden_states', 'attentions'])
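For completeness, a minimal sketch (same assumptions as above) of reading the extra fields off the output object by name:
o = bert_model(
    encoding_sample['input_ids'],
    encoding_sample['attention_mask'],
    output_attentions=True,
    output_hidden_states=True
)
print(o.last_hidden_state.shape)  # (batch, seq_len, hidden_size)
print(o.pooler_output.shape)      # (batch, hidden_size)
print(len(o.hidden_states))       # 13 for bert-base: embedding output + 12 layers
print(len(o.attentions))          # 12 for bert-base: one attention map per layer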