HuggingFace SciBert AutoModelForMaskedLM cannot be imported

I am trying to use the pretrained SciBERT model (https://huggingface.co/allenai/scibert_scivocab_uncased) from Hugging Face to evaluate masked words in scientific/biomedical text for bias using CrowS-Pairs (https://github.com/nyu-mll/crows-pairs/). The CrowS-Pairs code works well with the built-in models such as BERT.

I modified the code of metric.py so that the SciBERT model can be selected as an option:

import os
import csv
import json
import math
import torch
import argparse
import difflib
import logging
import numpy as np
import pandas as pd

from transformers import BertTokenizer, BertForMaskedLM
from transformers import AlbertTokenizer, AlbertForMaskedLM
from transformers import RobertaTokenizer, RobertaForMaskedLM
from transformers import AutoTokenizer, AutoModelForMaskedLM

and got the following error:

2021-06-21 17:11:38.626413: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
Traceback (most recent call last):
  File "metric.py", line 15, in <module>
    from transformers import AutoTokenizer, AutoModelForMaskedLM
ImportError: cannot import name 'AutoModelForMaskedLM' from 'transformers' (/usr/local/lib/python3.7/dist-packages/transformers/__init__.py)

Later in the Python file, the tokenizer and model are created from AutoTokenizer and AutoModelForMaskedLM as follows:

tokenizer = AutoTokenizer.from_pretrained("allenai/scibert_scivocab_uncased")
model = AutoModelForMaskedLM.from_pretrained("allenai/scibert_scivocab_uncased") 

Libraries

huggingface-hub-0.0.8
sacremoses-0.0.45
tokenizers-0.10.3
transformers-4.7.0 

The error occurs both with and without GPU support.

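Try this. SciBERT uses the standard BERT architecture, so it can be loaded with the BERT classes that metric.py already imports: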

tokenizer = BertTokenizer.from_pretrained("allenai/scibert_scivocab_uncased", do_lower_case=True)

model = BertForMaskedLM.from_pretrained("allenai/scibert_scivocab_uncased")
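
If you still want the Auto classes, it may be worth checking which transformers install the script actually imports. This is an assumption on my part, not something the traceback confirms: the interpreter may be picking up a different (older) copy under /usr/local/lib/python3.7/dist-packages than the 4.7.0 listed above, since 4.7.0 should export AutoModelForMaskedLM. A minimal sketch of such a check, where inspect_transformers is a hypothetical helper name and not part of any library:

```python
import importlib
import importlib.util


def inspect_transformers():
    """Report which transformers install is importable and whether it
    exposes AutoModelForMaskedLM. Hypothetical helper for debugging only."""
    info = {
        "installed": False,
        "version": None,
        "path": None,
        "has_auto_masked_lm": False,
    }
    # find_spec returns None when the package is not on sys.path at all.
    if importlib.util.find_spec("transformers") is None:
        return info
    try:
        module = importlib.import_module("transformers")
    except Exception:
        # The package exists on disk but fails to import (broken install).
        return info
    info["installed"] = True
    info["version"] = getattr(module, "__version__", None)
    info["path"] = getattr(module, "__file__", None)
    try:
        # hasattr only swallows AttributeError, so guard against other
        # errors raised while the attribute is being resolved.
        info["has_auto_masked_lm"] = hasattr(module, "AutoModelForMaskedLM")
    except Exception:
        info["has_auto_masked_lm"] = False
    return info


print(inspect_transformers())
```

Run it with the same interpreter that runs metric.py. If the reported version is not 4.7.0, or the path differs from where pip installed the package, restarting the runtime or reinstalling transformers in that environment is the likely next step.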