Training on deeppavlov for NER keeps failing
I have been trying to train a deeppavlov model for NER using the training syntax given in their documentation, but it keeps failing with the following error message:
/opt/anaconda3/envs/py36/lib/python3.6/site-packages/deeppavlov/dataset_readers/conll2003_reader.py in parse_ner_file(self, file_name)
104 items = line.split()
105 if len(items) < expected_items:
--> 106 raise Exception(f"Input is not valid {line}")
107 tokens.append(items[0])
108 tags.append(items[-1])
Exception: Input is not valid aio-pika==6.4.1
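I am using the code below to train the deeppavlov model. It seems to work on their sample dataset, but when I create my own dataset following their training sample guide, I keep getting the error above.
NER training code: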
from deeppavlov import configs, train_model, build_model
from deeppavlov.core.commands.utils import parse_config
import json
with configs.ner.ner_ontonotes_bert_mult.open(encoding='utf8') as f:
    ner_config = json.load(f)
ner_config['dataset_reader']['data_path'] = '/Users/smankari001/deeppavlov' # directory with train.txt, valid.txt and test.txt files
ner_config['metadata']['variables']['NER_PATH'] = '/Users/smankari001/deeppavlov'
ner_config['metadata']['download'] = [ner_config['metadata']['download'][-1]] # do not download the pretrained ontonotes model
ner_model = train_model(ner_config, download=True)
Input train.txt file:
What O
kind O
of O
memory O
? O
We O
respectfully O
invite O
you O
to O
watch O
a O
special O
edition O
of O
Across B-ORG
China I-ORG
. O
WW B-WORK_OF_ART
II I-WORK_OF_ART
Landmarks I-WORK_OF_ART
on I-WORK_OF_ART
the I-WORK_OF_ART
Great I-WORK_OF_ART
Earth I-WORK_OF_ART
of I-WORK_OF_ART
China I-WORK_OF_ART
: I-WORK_OF_ART
Eternal I-WORK_OF_ART
Memories I-WORK_OF_ART
of I-WORK_OF_ART
Taihang I-WORK_OF_ART
Mountain I-WORK_OF_ART
Standing O
tall O
on O
Taihang B-LOC
Mountain I-LOC
is O
the B-WORK_OF_ART
Monument I-WORK_OF_ART
to I-WORK_OF_ART
the I-WORK_OF_ART
Hundred I-WORK_OF_ART
Regiments I-WORK_OF_ART
Offensive I-WORK_OF_ART
. O
It O
is O
composed O
of O
a O
primary O
stele O
, O
secondary O
steles O
, O
a O
huge O
round O
sculpture O
and O
beacon O
tower O
, O
and O
the B-WORK_OF_ART
Great I-WORK_OF_ART
Wall I-WORK_OF_ART
, O
among O
other O
things O
. O
A O
primary O
stele O
, O
three B-CARDINAL
secondary O
steles O
, O
and O
two B-CARDINAL
inscribed O
steles O
. O
The B-EVENT
Hundred I-EVENT
Regiments I-EVENT
Offensive I-EVENT
was O
the O
campaign O
of O
the O
largest O
scale O
launched O
by O
the B-ORG
Eighth I-ORG
Route I-ORG
Army I-ORG
during O
the B-EVENT
War I-EVENT
of I-EVENT
Resistance I-EVENT
against I-EVENT
Japan I-EVENT
. O
This O
campaign O
broke O
through O
the O
Japanese B-NORP
army O
's O
blockade O
to O
reach O
base O
areas O
behind O
enemy O
lines O
, O
stirring O
up O
anti-Japanese B-NORP
spirit O
throughout O
the O
nation O
and O
influencing O
the O
situation O
of O
the O
anti-fascist O
war O
of O
the O
people O
worldwide O
. O
For ner_config['dataset_reader']['data_path'], you need to specify the path to a folder that contains only the dataset files (train/valid/test).
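One way to follow this advice is to check what else lives in the folder and, if needed, copy the three dataset files into a folder of their own. The sketch below uses only the Python standard library; the helper names `unexpected_entries` and `isolate_dataset` are hypothetical and are not part of deeppavlov, and the file names train.txt, valid.txt and test.txt are taken from the comment in the question's code.

```python
from pathlib import Path
import shutil

# File names taken from the comment in the question's code.
EXPECTED = ("train.txt", "valid.txt", "test.txt")

def unexpected_entries(data_dir):
    """Return the names in data_dir that are not one of the three dataset files."""
    return sorted(p.name for p in Path(data_dir).iterdir() if p.name not in EXPECTED)

def isolate_dataset(src_dir, dst_dir):
    """Copy only the dataset files from src_dir into dst_dir (created if missing)."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for name in EXPECTED:
        shutil.copy2(src / name, dst / name)
    return dst
```

If `unexpected_entries` returns anything for the current folder, the path returned by `isolate_dataset` could then be assigned to ner_config['dataset_reader']['data_path'] in place of the original folder.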
This error:
Exception: Input is not valid aio-pika==6.4.1
means that the DatasetReader started reading lines from a requirements.txt file.
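To see which file such a line comes from, a small scan can apply the same test shown in the traceback (`len(line.split()) < expected_items`) to every .txt file in the folder. This is a rough sketch, not deeppavlov code: the function name `find_invalid_lines` is hypothetical, the default `expected_items=2` (token plus tag) is an assumption, and skipping blank lines is also an assumption about how sentence separators are treated.

```python
from pathlib import Path

def find_invalid_lines(data_dir, expected_items=2):
    """List (file name, line number, text) for lines with too few columns.

    expected_items=2 is an assumption: one token column plus one tag column.
    Blank lines are skipped on the assumption that they separate sentences.
    """
    bad = []
    for path in sorted(Path(data_dir).glob("*.txt")):
        with path.open(encoding="utf8") as f:
            for lineno, line in enumerate(f, start=1):
                items = line.split()
                if items and len(items) < expected_items:
                    bad.append((path.name, lineno, line.rstrip("\n")))
    return bad
```

Running it on the folder from the question should point at any file whose lines are not in the token/tag layout, such as a requirements.txt sitting next to train.txt.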