After installing scrubadub_spacy package, spacy.load("en_core_web_sm") not working OSError: [E053] Could not read config.cfg

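When I run the following line of code to load en_core_web_sm in an Azure Machine Learning instance, I get the error below.

I debugged the issue and found that the error only appears once I install scrubadub_spacy, so that seems to be the cause.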

spacy.load("en_core_web_sm")
OSError                                   Traceback (most recent call last)
<ipython-input-2-c6e652d70518> in <module>
     1 # Load English tokenizer, tagger, parser and NER
----> 2 nlp = spacy.load("en_core_web_sm")

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/spacy/__init__.py in load(name, vocab, disable, exclude, config)
    50     """
    51     return util.load_model(
---> 52         name, vocab=vocab, disable=disable, exclude=exclude, config=config
    53     )
    54 

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/spacy/util.py in load_model(name, vocab, disable, exclude, config)
   418             return get_lang_class(name.replace("blank:", ""))()
   419         if is_package(name):  # installed as package
--> 420             return load_model_from_package(name, **kwargs)  # type: ignore[arg-type]
   421         if Path(name).exists():  # path to model data directory
   422             return load_model_from_path(Path(name), **kwargs)  # type: ignore[arg-type]

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/spacy/util.py in load_model_from_package(name, vocab, disable, exclude, config)
   451     """
   452     cls = importlib.import_module(name)
--> 453     return cls.load(vocab=vocab, disable=disable, exclude=exclude, config=config)  # type: ignore[attr-defined]
   454 
   455 

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/en_core_web_sm/__init__.py in load(**overrides)
    10 
    11 def load(**overrides):
---> 12     return load_model_from_init_py(__file__, **overrides)

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/spacy/util.py in load_model_from_init_py(init_file, vocab, disable, exclude, config)
   619         disable=disable,
   620         exclude=exclude,
--> 621         config=config,
   622     )
   623 

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/spacy/util.py in load_model_from_path(model_path, meta, vocab, disable, exclude, config)
   485     config_path = model_path / "config.cfg"
   486     overrides = dict_to_dot(config)
--> 487     config = load_config(config_path, overrides=overrides)
   488     nlp = load_model_from_config(config, vocab=vocab, disable=disable, exclude=exclude)
   489     return nlp.from_disk(model_path, exclude=exclude, overrides=overrides)

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/spacy/util.py in load_config(path, overrides, interpolate)
   644     else:
   645         if not config_path or not config_path.exists() or not config_path.is_file():
--> 646             raise IOError(Errors.E053.format(path=config_path, name="config.cfg"))
   647         return config.from_disk(
   648             config_path, overrides=overrides, interpolate=interpolate

OSError: [E053] Could not read config.cfg from /anaconda/envs/azureml_py36/lib/python3.6/site-packages/en_core_web_sm/en_core_web_sm-2.3.1/config.cfg

I installed the packages with the following three commands, taken from spaCy:
pip install -U pip setuptools wheel
pip install -U spacy
python -m spacy download en_core_web_sm

How can I fix this? Thanks in advance.

Look at the path in your error message:

en_core_web_sm-2.3.1/config.cfg

You have a v2.3 model installed, but spaCy is looking for a config.cfg, which only exists in spaCy v3 models. It looks like you upgraded spaCy without realizing it.
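To confirm this on your own machine, you can check whether the model directory actually contains the config.cfg that spaCy v3 expects. The following is only a minimal sketch: the helper name looks_like_v3_model is made up for illustration, and the directory is the one from the error message in the question.

```python
from pathlib import Path


def looks_like_v3_model(model_dir):
    """Return True if model_dir contains the config.cfg file that spaCy v3 loads.

    Hypothetical helper for illustration; it only checks that the file exists,
    not that the model is otherwise valid.
    """
    return (Path(model_dir) / "config.cfg").is_file()


# Directory taken from the error message in the question.
model_dir = (
    "/anaconda/envs/azureml_py36/lib/python3.6/site-packages/"
    "en_core_web_sm/en_core_web_sm-2.3.1"
)
print(looks_like_v3_model(model_dir))
```

If this prints False for a model that is installed, the model was most likely built for spaCy v2.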

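There are two ways to fix this. One is to reinstall the model with spacy download, which fetches a version that matches your current spaCy version. If you are just starting a project, that is probably the best option. Judging by its release date, scrubadub appears to be built for spaCy v3.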
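As a rough illustration of what "matching" means here: spaCy's trained pipelines are versioned so that the first two parts of the model version follow the spaCy major and minor version. The sketch below compares two version strings on that basis. The function name is hypothetical, and the spaCy version "3.0.0" is only a placeholder, since the question does not say which v3 release is installed. The command python -m spacy validate performs a real compatibility check of installed models.

```python
def model_matches_spacy(spacy_version, model_version):
    """Return True when the major.minor parts of both version strings agree.

    Hypothetical helper: a simplified comparison, not spaCy's own
    compatibility logic.
    """
    return spacy_version.split(".")[:2] == model_version.split(".")[:2]


# Model version 2.3.1 comes from the error message; "3.0.0" is a placeholder
# for whichever spaCy v3 release is actually installed.
print(model_matches_spacy("3.0.0", "2.3.1"))  # False
```

In a live environment, spacy.__version__ gives the installed spaCy version to pass as the first argument.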

However, note that v2 and v3 are very different. If you have an existing project that relies on spaCy v2, you may want to downgrade spaCy instead.