Python Spacy error: RuntimeError: Language not supported
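I am trying to add a new entity to my own spaCy data model, "mymodel". I installed "mymodel" by following this tutorial, and it worked fine. But when I try to use "mymodel" to add a new entity, I get an error. Please help.
Here is my code: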
import plac
from spacy.en import English
from spacy.gold import GoldParse
import spacy
nlp = spacy.load('mymodel')

def main(out_loc):
    nlp = English(parser=False) # Avoid loading the parser, for quick load times
    # Run the tokenizer and tagger (but not the entity recognizer)
    doc = nlp.tokenizer(u'Lions and tigers and grizzly bears!')
    nlp.tagger(doc)
    nlp.entity.add_label('ANIMAL') # <-- New in v0.100
    # Create a GoldParse object. This should have a better API...
    indices = tuple(range(len(doc)))
    words = [w.text for w in doc]
    tags = [w.tag_ for w in doc]
    heads = [0 for _ in doc]
    deps = ['' for _ in doc]
    # This is the only part we care about. We want BILOU format
    ner = ['U-ANIMAL', 'O', 'U-ANIMAL', 'O', 'B-ANIMAL', 'L-ANIMAL', 'O']
    # Create the GoldParse
    annot = GoldParse(doc, (indices, words, tags, heads, deps, ner))
    # Update the weights with the example
    # Here we iterate until we get it entirely correct. In practice this is probably a bad idea.
    # Note that we've added a class to the existing model here! We "resume"
    # training the previous model. Whether this is good or not I can't say, you'll have to
    # experiment.
    loss = nlp.entity.train(doc, annot)
    i = 0
    while loss != 0 and i < 1000:
        loss = nlp.entity.train(doc, annot)
        i += 1
    print("Used %d iterations" % i)
    nlp.entity(doc)
    for ent in doc.ents:
        print(ent.text, ent.label_)
    nlp.entity.model.dump(out_loc)

if __name__ == '__main__':
    plac.call(main)
**Error output:**
Traceback (most recent call last):
  File "/home/vv/webapp/dic_model.py", line 7, in <module>
    nlp = spacy.load('mymodel')
  File "/usr/local/lib/python3.5/dist-packages/spacy/__init__.py", line 26, in load
    lang_name = util.get_lang_class(name).lang
  File "/usr/local/lib/python3.5/dist-packages/spacy/util.py", line 27, in get_lang_class
    raise RuntimeError('Language not supported: %s' % name)
RuntimeError: Language not supported: mymodel
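The problem here is that spacy.load() currently expects either a language ID (e.g. 'en') or a shortcut link to a model, which tells spaCy where to find the data. Because spaCy can't find a shortcut link, it assumes that 'my_model' is a language, which obviously doesn't exist.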
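To make the lookup order concrete, here is a minimal sketch of the behavior described above. It is not spaCy's source code: the function name `resolve_model_name`, the `shortcut_links` dictionary and the `KNOWN_LANGUAGES` set are all invented for illustration, and only the error message is taken from the traceback in the question.

```python
# Hypothetical, simplified model of the lookup described above.
# This is NOT spaCy's implementation; every name here is an assumption.

KNOWN_LANGUAGES = {"en", "de"}  # assumed example set of language IDs


def resolve_model_name(name, shortcut_links):
    """Return a (kind, target) pair for `name`.

    shortcut_links is a plain dict mapping link names to data paths,
    standing in for whatever links exist on disk.
    """
    # 1. A shortcut link wins if one exists under this name.
    if name in shortcut_links:
        return ("link", shortcut_links[name])
    # 2. Otherwise the name is treated as a language ID.
    if name in KNOWN_LANGUAGES:
        return ("language", name)
    # 3. Neither a link nor a language: same message as in the traceback.
    raise RuntimeError("Language not supported: %s" % name)
```

With an empty `shortcut_links`, `resolve_model_name("mymodel", {})` raises the `RuntimeError` from the question. Once a link named `mymodel` is registered, the same call returns the link target instead.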
You can set up a link for your model like this:
python -m spacy link my_model my_model # if it's installed via pip, or:
python -m spacy link /path/to/my_model/data my_model
This will create a symlink in the /spacy/data directory, so you should run it with admin permissions.
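If you want to confirm that the link was actually created, a small check along these lines may help. This is only a sketch: `describe_link` is a hypothetical helper, and `data_dir` is assumed to be the path of the spacy/data directory of your installation, which you need to supply yourself.

```python
import os


def describe_link(data_dir, name):
    """Report whether a link called `name` exists inside `data_dir`.

    data_dir is an assumption: pass the spacy/data directory of your install.
    Returns "missing", "broken" or "ok".
    """
    path = os.path.join(data_dir, name)
    # lexists() is True even for a symlink whose target no longer exists.
    if not os.path.lexists(path):
        return "missing"
    # exists() follows the symlink, so it is False for a dangling link.
    if not os.path.exists(path):
        return "broken"
    return "ok"
```

A result of "missing" suggests the link command did not run successfully (for example because of insufficient permissions), while "broken" suggests the model data was moved or deleted after linking.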
Alternatively, if you've created a model package that can be installed via pip, you can simply install it, import it, and then call its load() method with no arguments:
import my_model
nlp = my_model.load()
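In some cases, loading the model this way is actually more convenient, because it's cleaner and makes your code easier to debug. For example, if the model isn't installed, Python raises an ImportError immediately. Similarly, if loading fails, you know the problem is probably in the model's own loading code or meta data.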
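The debugging benefit can be sketched with a small wrapper that separates "package not installed" from "package failed to load". The helper name `load_model_package` is hypothetical. The only thing assumed about the model package is what is stated above: that it exposes a load() function callable without arguments.

```python
import importlib


def load_model_package(package_name):
    """Import a pip-installed model package and call its load().

    Hypothetical helper: assumes the package exposes a no-argument load().
    """
    try:
        module = importlib.import_module(package_name)
    except ImportError as err:
        # The package is not importable at all, so the model is not installed.
        raise ImportError(
            "Model package %r is not installed: %s" % (package_name, err)
        )
    # Any exception raised from here on comes from the model's own
    # loading code or meta data, not from a missing installation.
    return module.load()
```

Calling `load_model_package('my_model')` is then equivalent to the two lines above, but a missing package is reported with the package name in the message.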
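By the way: I'm one of the spaCy maintainers, and I agree that the way spacy.load() currently works is definitely not ideal and is confusing. We're looking forward to finally changing this in the next major release. We're very close to releasing the first alpha of v2.0, which will solve this problem more elegantly and will also include a lot of improvements to the training process and the documentation.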