Find named entities from tokenized sentences in spaCy v2.0
What I am trying to do:
- tokenize the sentences in a text
- compute the named entities for the words that appear in each sentence
Here is what I have done so far:
import spacy

nlp = spacy.load('en')
sentence = "Germany and U.S.A are popular countries. I am going to gym tonight"
sentence = nlp(sentence)
tokenized_sentences = []
for sent in sentence.sents:
    tokenized_sentences.append(sent)
for s in tokenized_sentences:
    labels = [ent.label_ for ent in s.ents]
    entities = [ent.text for ent in s.ents]
Error:
labels = [ent.label_ for ent in s.ents]
AttributeError: 'spacy.tokens.span.Span' object has no attribute 'ents'
Is there any other way to find the named entities of a tokenized sentence?
Thanks in advance
Note that you only have two entities here - U.S.A and Germany.
The simple version:
sentence = nlp("Germany and U.S.A are popular countries. I am going to gym tonight")
for ent in sentence.ents:
    print(ent.text, ent.label_)
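(With a stock English model this should print Germany and U.S.A; both are typically labelled GPE, though the exact labels depend on the model you load.)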
What I think you are trying to do:
sentence = nlp("Germany and U.S.A are popular countries. I am going to gym tonight")
for sent in sentence.sents:
    tmp = nlp(str(sent))
    for ent in tmp.ents:
        print(ent.text, ent.label_)
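Re-running nlp() on every sentence works, but it runs the whole pipeline a second time. A minimal sketch of an alternative, assuming the same example text: keep the document-level entities and group them by sentence using token offsets (Span.start and Span.end are indices into the shared Doc):

import spacy

nlp = spacy.load('en')  # the 'en' shortcut from the question; any English model works
doc = nlp("Germany and U.S.A are popular countries. I am going to gym tonight")

# An entity belongs to a sentence when its token range lies inside the
# sentence's token range; both are offsets into the same Doc.
for sent in doc.sents:
    sent_ents = [(ent.text, ent.label_)
                 for ent in doc.ents
                 if sent.start <= ent.start and ent.end <= sent.end]
    print(sent.text, sent_ents)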
ents only works on a document (spacy.tokens.doc.Doc), i.e. the object you get from doc = nlp(text). The sentences yielded by .sents are of type spacy.tokens.span.Span, which has no ents attribute. Convert the span back to text and run nlp() on it again:
print([(ent.text, ent.label_) for ent in nlp(sent.text).ents])
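As a side note, I believe later spaCy releases (v2.0.13 onward, including v3) added an ents property to Span itself, so on those versions the loop from the question works as written, with no second nlp() call:

# Assumes a spaCy version where Span exposes .ents (v2.0.13+, if I
# remember the changelog correctly); doc is the parsed Doc from above.
for sent in doc.sents:
    print([(ent.text, ent.label_) for ent in sent.ents])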