Getting started: Huggingface Model Cards
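I only recently started looking into the Hugging Face transformers library.
When I try to get started with the code from a model card, for example for a community model: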
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("emilyalsentzer/Bio_ClinicalBERT")
model = AutoModel.from_pretrained("emilyalsentzer/Bio_ClinicalBERT")
I get the following error:
Traceback (most recent call last):
  File "test.py", line 2, in <module>
    tokenizer = AutoTokenizer.from_pretrained("emilyalsentzer/Bio_ClinicalBERT")
  File "/Users/Lukas/miniconda3/envs/nlp/lib/python3.7/site-packages/transformers/tokenization_auto.py", line 124, in from_pretrained
    "'xlm', 'roberta', 'ctrl'".format(pretrained_model_name_or_path))
ValueError: Unrecognized model identifier in emilyalsentzer/Bio_ClinicalBERT. Should contains one of 'bert', 'openai-gpt', 'gpt2', 'transfo-xl', 'xlnet', 'xlm', 'roberta', 'ctrl'
If I try a different tokenizer, such as "baykenney/bert-base-gpt2detector-topp92", I get this error instead:
OSError: Model name 'baykenney/bert-base-gpt2detector-topp92' was not found in tokenizers model name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese, bert-base-german-cased, bert-large-uncased-whole-word-masking, bert-large-cased-whole-word-masking, bert-large-uncased-whole-word-masking-finetuned-squad, bert-large-cased-whole-word-masking-finetuned-squad, bert-base-cased-finetuned-mrpc, bert-base-german-dbmdz-cased, bert-base-german-dbmdz-uncased). We assumed 'baykenney/bert-base-gpt2detector-topp92' was a path or url to a directory containing vocabulary files named ['vocab.txt'] but couldn't find such vocabulary files at this path or url.
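Am I missing a step needed to get started? The model card gives the impression that these three lines of code should be enough.
I am using Python 3.7, transformers version 2.1.1, and PyTorch 1.5.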
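Please update your transformers library to at least 2.4.0. You should create a new conda environment and install all packages with pip directly from PyPI to get the latest version (currently 2.11.0).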
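To confirm which version a given environment actually picks up, a small check like the one below may help. It is only a sketch: the helper names `version_tuple` and `is_recent_enough` are made up for illustration, and the only library attribute it relies on is `transformers.__version__`. Versions are compared as integer tuples because a plain string comparison would rank "2.11.0" below "2.4.0".

```python
import re

# Minimum version suggested in the answer above.
MIN_VERSION = "2.4.0"


def version_tuple(version):
    """Turn '2.11.0' into (2, 11, 0).

    Hypothetical helper: only the leading digits of the first three
    dot-separated fields are used, so suffixes such as 'rc1' are ignored.
    """
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    # Pad short versions such as '2.4' so tuples compare field by field.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_recent_enough(installed, minimum=MIN_VERSION):
    """Return True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


if __name__ == "__main__":
    try:
        import transformers
    except ImportError:
        print("transformers is not installed in this environment")
    else:
        installed = transformers.__version__
        if is_recent_enough(installed):
            print("transformers", installed, "meets the minimum", MIN_VERSION)
        else:
            print("transformers", installed, "is older than", MIN_VERSION)
```

If the script reports an older version after upgrading, the interpreter is probably running from a different environment than the one pip installed into.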