StanfordNLP, CoreNLP, spaCy - different dependency graphs

Question

I am trying to extract very basic information from sentences (e.g., triples such as subject -> predicate -> object) using simple rules/patterns defined over the dependency graph. I started with StanfordNLP since it was easy to set up and utilizes the GPU for better performance. However, I've noticed that for some sentences the resulting dependency graph did not look the way I would have expected, though I'm no expert. I therefore tried two other solutions: spaCy and Stanford CoreNLP (as I understand it, these are maintained by different groups?).

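For the example sentence "Tom made Sam believe that Alice has cancer." I have printed the dependencies produced by all three tools. CoreNLP and spaCy yield the same dependencies, and these differ from the ones StanfordNLP produces. Hence, I'm inclined to switch to CoreNLP or spaCy (another advantage being that they come with NER out of the box).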

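Does anyone have more experience or feedback that could help me decide where to go from here? I don't expect CoreNLP and spaCy to always produce the same dependency graph, but for the example sentence, treating Sam as obj of "made", as StanfordNLP does, rather than as nsubj of "believe" (CoreNLP, spaCy) seems to be a significant difference.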
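To make the impact of that difference concrete, here is a minimal sketch of a naive subject/predicate/object rule applied to the rows from the listings in this post. The function name `extract_triples` and the tag sets are my own assumptions for illustration, not part of any of the three libraries. It also identifies heads by token text, which only works because every token in this sentence is unique; a real implementation would use token indices.

```python
from collections import defaultdict

# Rows are (token, dependency_tag, parent_token), copied from the listings in this post.
STANFORDNLP = [
    ("Tom", "nsubj", "made"), ("made", "ROOT", "ROOT"), ("Sam", "obj", "made"),
    ("believe", "ccomp", "made"), ("that", "mark", "has"), ("Alice", "nsubj", "has"),
    ("has", "ccomp", "believe"), ("cancer", "obj", "has"), (".", "punct", "made"),
]
CORENLP_SPACY = [
    ("Tom", "nsubj", "made"), ("made", "ROOT", "ROOT"), ("Sam", "nsubj", "believe"),
    ("believe", "ccomp", "made"), ("that", "mark", "has"), ("Alice", "nsubj", "has"),
    ("has", "ccomp", "believe"), ("cancer", "dobj", "has"), (".", "punct", "made"),
]

SUBJECT_TAGS = {"nsubj"}
# Assumption: "obj" and "dobj" are treated as the same relation under different label sets.
OBJECT_TAGS = {"obj", "dobj"}


def extract_triples(rows):
    """Return (subject, predicate, object) for each head with both a subject and an object child."""
    subjects = defaultdict(list)
    objects = defaultdict(list)
    for token, tag, parent in rows:
        if tag in SUBJECT_TAGS:
            subjects[parent].append(token)
        elif tag in OBJECT_TAGS:
            objects[parent].append(token)
    triples = []
    for head, subs in subjects.items():
        for subj in subs:
            for obj in objects.get(head, []):
                triples.append((subj, head, obj))
    return triples


if __name__ == "__main__":
    print("StanfordNLP:  ", extract_triples(STANFORDNLP))
    print("CoreNLP/spaCy:", extract_triples(CORENLP_SPACY))
```

With this rule, the StanfordNLP rows yield (Tom, made, Sam) and (Alice, has, cancer), while the CoreNLP/spaCy rows yield only (Alice, has, cancer), because "believe" has a subject but no direct object and "made" has no object at all. So the choice of parser changes what a simple pattern extracts, and the rules would need to handle ccomp chains explicitly in either case.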

Format:
token   dependency_tag   parent_token

StanfordNLP
Tom     nsubj   made
made    ROOT    ROOT
Sam     obj     made
believe ccomp   made
that    mark    has
Alice   nsubj   has
has     ccomp   believe
cancer  obj     has
.       punct   made

CoreNLP
Tom     nsubj   made
made    ROOT    ROOT
Sam     nsubj   believe
believe ccomp   made
that    mark    has
Alice   nsubj   has
has     ccomp   believe
cancer  dobj    has
.       punct   made

spaCy
Tom     nsubj   made
made    ROOT    ROOT
Sam     nsubj   believe
believe ccomp   made
that    mark    has
Alice   nsubj   has
has     ccomp   believe
cancer  dobj    has
.       punct   made
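Comparing the listings by eye gets tedious for longer sentences, so here is a small sketch that diffs two parses given in the three-column format above. The helper names `parse_rows` and `diff_parses` are hypothetical, and treating "obj" and "dobj" as equivalent labels is an assumption on my part (the tools appear to use different label inventories for the same relation). It also assumes both tools tokenized the sentence identically.

```python
STANFORDNLP = """
Tom     nsubj   made
made    ROOT    ROOT
Sam     obj     made
believe ccomp   made
that    mark    has
Alice   nsubj   has
has     ccomp   believe
cancer  obj     has
.       punct   made
"""

CORENLP = """
Tom     nsubj   made
made    ROOT    ROOT
Sam     nsubj   believe
believe ccomp   made
that    mark    has
Alice   nsubj   has
has     ccomp   believe
cancer  dobj    has
.       punct   made
"""


def parse_rows(text):
    """Turn a whitespace-separated listing into (token, tag, parent) tuples."""
    return [tuple(line.split()) for line in text.strip().splitlines()]


def diff_parses(a, b, equivalent_tags=(("obj", "dobj"),)):
    """Return rows where tag or parent differ, ignoring labels declared equivalent."""
    if len(a) != len(b):
        raise ValueError("parses have different numbers of tokens")
    # Map every tag in an equivalence group to the group's first member.
    canon = {tag: group[0] for group in equivalent_tags for tag in group}
    diffs = []
    for (tok_a, tag_a, par_a), (tok_b, tag_b, par_b) in zip(a, b):
        if tok_a != tok_b:
            raise ValueError(f"tokenizations differ: {tok_a!r} vs {tok_b!r}")
        if (canon.get(tag_a, tag_a), par_a) != (canon.get(tag_b, tag_b), par_b):
            diffs.append((tok_a, (tag_a, par_a), (tag_b, par_b)))
    return diffs


if __name__ == "__main__":
    for token, left, right in diff_parses(parse_rows(STANFORDNLP), parse_rows(CORENLP)):
        print(f"{token}: StanfordNLP={left} CoreNLP={right}")
```

For the example sentence, the only structural disagreement this reports is the attachment of Sam; the obj/dobj difference on "cancer" is just a naming difference under the stated assumption.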

Answer 1

I'm not sure how to solve your problem, but I suggest reading the Stanford CoreNLP documentation carefully: https://nlp.stanford.edu/software/lex-parser.shtml

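That package contains several syntactic parsers and dependency parsers you can use. Looking at syntactic parsing alone, there is an option to retrieve the k-best parses, and if you convert those to dependencies you will most likely get a different set of dependencies for each one.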

This comes down to parser inaccuracy and the ambiguity of natural language.

nlp

stanford-nlp

spacy

dependency-parsing