How can I identify the perpetrator and victim in a sentence using NLP?
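I'm new to NLP and am looking for topics to explore that would help me work out who is doing what in a sentence. Specifically, I want to identify the victim and the attacker in cases like these: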
The UK was attacked by China over several weeks
Over several weeks, China attacked the UK.
Using spaCy, I have identified the subjects, but they change depending on their position in the sentence:
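import spacy

# Load the small English pipeline (assumes en_core_web_sm is installed)
nlp = spacy.load("en_core_web_sm")
doc1 = nlp("China attacked the UK over several weeks")
doc2 = nlp("The UK was attacked by China over several weeks")
docs = [doc1, doc2]
for doc in docs:
    print("============")
    # For each noun chunk: its text, root token, dependency label, and head
    for chunk in doc.noun_chunks:
        print(chunk.text, chunk.root.text, chunk.root.dep_,
              chunk.root.head.text)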
Output:
============
China China nsubj attacked
the UK UK dobj attacked
several weeks weeks pobj over
============
The UK UK nsubjpass attacked
China China pobj by
several weeks weeks pobj over
Any help and guidance would be much appreciated.
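This is called semantic role labeling, and it is hard. In spaCy, our general recommendation is not to model it as NER. Instead, use generic NER labels such as PERSON (or GPE here) together with the dependency parse, and see how far that gets you before considering other approaches.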
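As a rough illustration of the dependency-parse route, here is a minimal sketch rather than a definitive solution. The helper names `find_attack_roles` and `demo` are made up for this example. It relies on the `nsubj`, `dobj`, `nsubjpass` and `pobj` labels shown in the question's output, and it additionally assumes that the English pipeline labels the passive "by" as `agent`; check that against your own parse before relying on it.

```python
def find_attack_roles(doc, verb_lemma="attack"):
    """Return {"attacker": ..., "victim": ...} for the first verb whose lemma
    matches verb_lemma, or None if no such verb is found.

    `doc` is any iterable of spaCy-like tokens exposing .text, .lemma_,
    .pos_, .dep_ and .children. (Hypothetical helper, not part of spaCy.)
    """
    for token in doc:
        if token.pos_ != "VERB" or token.lemma_ != verb_lemma:
            continue
        roles = {"attacker": None, "victim": None}
        for child in token.children:
            if child.dep_ == "nsubj":
                # Active voice: the subject performs the attack
                roles["attacker"] = child.text
            elif child.dep_ in ("dobj", "nsubjpass"):
                # Direct object (active) or passive subject: the target
                roles["victim"] = child.text
            elif child.dep_ == "agent":
                # Assumed label for the passive "by"; its pobj is the attacker
                for grandchild in child.children:
                    if grandchild.dep_ == "pobj":
                        roles["attacker"] = grandchild.text
        return roles
    return None


def demo():
    """Run the helper on the question's sentences (needs spaCy and the
    en_core_web_sm model installed)."""
    import spacy

    nlp = spacy.load("en_core_web_sm")
    for text in (
        "China attacked the UK over several weeks",
        "The UK was attacked by China over several weeks",
    ):
        print(text, "->", find_attack_roles(nlp(text)))
```

Calling `demo()` prints the extracted roles for both sentences. If you only want named entities as candidates, you could also check `token.ent_type_` (for example, keep only `GPE` or `PERSON`) before accepting a role. Rules like these only cover the constructions you write them for, which is part of why the general problem is hard.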
See section 10 in chapter 4 of the spaCy course for a very specific overview of this problem.
For an overview of research on this topic