StanfordNLP, CoreNLP, spaCy - different dependency graphs
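I am trying to extract very basic information from sentences (e.g., triples such as subject -> predicate -> object) using simple rules/patterns defined over the dependency graph. I started with StanfordNLP since it was easy to set up and utilizes the GPU for better performance. However, I noticed that for some sentences the resulting dependency graph did not look the way I expected, though I am no expert. I therefore tried two other solutions: spaCy and Stanford CoreNLP (which, as I understand it, are maintained by different groups?).
For the example sentence "Tom made Sam believe that Alice has cancer." I have printed the dependencies produced by all three approaches. CoreNLP and spaCy yield the same dependencies, but these differ from StanfordNLP's. I am therefore inclined to switch to CoreNLP and spaCy (another advantage is that they come with NER out of the box).
Does anyone have more experience or feedback that could help me decide where to go from here? I do not expect CoreNLP and spaCy to always produce the same dependency graph, but for the example sentence, treating Sam as obj, as StanfordNLP does, versus nsubj (CoreNLP, spaCy) seems like a significant difference.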
Format:
token dependency_tag parent_token
StanfordNLP
Tom nsubj made
made ROOT ROOT
Sam obj made
believe ccomp made
that mark has
Alice nsubj has
has ccomp believe
cancer obj has
. punct made
CoreNLP
Tom nsubj made
made ROOT ROOT
Sam nsubj believe
believe ccomp made
that mark has
Alice nsubj has
has ccomp believe
cancer dobj has
. punct made
spaCy
Tom nsubj made
made ROOT ROOT
Sam nsubj believe
believe ccomp made
that mark has
Alice nsubj has
has ccomp believe
cancer dobj has
. punct made
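To make the disagreement easier to see, the three listings can be compared mechanically. The following is only a minimal sketch that works on the printed rows above, not on live parser output; the helper names parse_rows and diff_parses are made up for this illustration, and the optional alias that treats dobj as obj is an assumption that the two tags name the same relation in different label sets.

```python
def parse_rows(text):
    """Turn 'token dependency_tag parent_token' lines into (token, tag, parent) tuples."""
    return [tuple(line.split()) for line in text.strip().splitlines()]


# Rows copied from the listings above.
STANFORDNLP = parse_rows("""
Tom nsubj made
made ROOT ROOT
Sam obj made
believe ccomp made
that mark has
Alice nsubj has
has ccomp believe
cancer obj has
. punct made
""")

CORENLP = parse_rows("""
Tom nsubj made
made ROOT ROOT
Sam nsubj believe
believe ccomp made
that mark has
Alice nsubj has
has ccomp believe
cancer dobj has
. punct made
""")

SPACY = list(CORENLP)  # the spaCy listing above is identical to the CoreNLP one


def diff_parses(a, b, aliases=None):
    """Return (token, arc_in_a, arc_in_b) for every token whose arc differs.

    An arc is a (tag, parent) pair. `aliases` optionally maps one tag name
    onto another before comparing (assumption: the tags are equivalent).
    """
    aliases = aliases or {}
    diffs = []
    for (tok_a, dep_a, head_a), (tok_b, dep_b, head_b) in zip(a, b):
        if tok_a != tok_b:
            raise ValueError(f"tokenization differs: {tok_a!r} vs {tok_b!r}")
        arc_a = (aliases.get(dep_a, dep_a), head_a)
        arc_b = (aliases.get(dep_b, dep_b), head_b)
        if arc_a != arc_b:
            diffs.append((tok_a, arc_a, arc_b))
    return diffs


# Raw comparison: two tokens differ (Sam and cancer).
print(diff_parses(STANFORDNLP, CORENLP))
# With dobj treated as obj, only the attachment of Sam remains.
print(diff_parses(STANFORDNLP, CORENLP, aliases={"dobj": "obj"}))
```

Seen this way, the obj/dobj difference on cancer is only a naming difference, while Sam differs in both tag and parent, which is the structural disagreement the question is about.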
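I am not sure how to solve your problem, but I suggest reading the Stanford CoreNLP documentation carefully: https://nlp.stanford.edu/software/lex-parser.shtml
The package provides several grammatical parsers and dependency parsers. Looking only at grammatical parsing, there is an option to retrieve the k-best parses, and if you convert those to dependencies you will most likely get a different set of dependencies for each one.
This comes down to parser inaccuracy and the ambiguity of natural language.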
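To see how much such a disagreement matters for the rule-based triple extraction described in the question, here is a rough sketch of a subject -> predicate -> object rule applied to two of the printed parses. The function extract_triples and the label sets are assumptions made for this illustration, not part of any of the three libraries, and parents are matched by token text, which only works because no word repeats in the example sentence.

```python
from collections import defaultdict

SUBJECT_LABELS = {"nsubj"}
OBJECT_LABELS = {"obj", "dobj"}  # assumption: both tags mark a direct object


def extract_triples(rows):
    """Pair every subject of a head with every direct object of the same head.

    `rows` are (token, dependency_tag, parent_token) tuples. Parents are
    identified by their text, a simplification for this one sentence.
    """
    subjects = defaultdict(list)
    objects = defaultdict(list)
    for token, dep, head in rows:
        if dep in SUBJECT_LABELS:
            subjects[head].append(token)
        elif dep in OBJECT_LABELS:
            objects[head].append(token)
    return [
        (subj, verb, obj)
        for verb, subjs in subjects.items()
        for subj in subjs
        for obj in objects.get(verb, [])
    ]


# Rows copied from the StanfordNLP listing in the question.
stanfordnlp_rows = [
    ("Tom", "nsubj", "made"), ("made", "ROOT", "ROOT"), ("Sam", "obj", "made"),
    ("believe", "ccomp", "made"), ("that", "mark", "has"),
    ("Alice", "nsubj", "has"), ("has", "ccomp", "believe"),
    ("cancer", "obj", "has"), (".", "punct", "made"),
]

# Rows copied from the CoreNLP listing (the spaCy listing is identical).
corenlp_rows = [
    ("Tom", "nsubj", "made"), ("made", "ROOT", "ROOT"), ("Sam", "nsubj", "believe"),
    ("believe", "ccomp", "made"), ("that", "mark", "has"),
    ("Alice", "nsubj", "has"), ("has", "ccomp", "believe"),
    ("cancer", "dobj", "has"), (".", "punct", "made"),
]

print(extract_triples(stanfordnlp_rows))
print(extract_triples(corenlp_rows))
```

With this naive rule the StanfordNLP rows give two triples, (Tom, made, Sam) and (Alice, has, cancer), while the CoreNLP/spaCy rows give only (Alice, has, cancer), because there Sam is a subject of believe and believe has no direct object. Whichever parser is chosen, the rules likely need to follow clausal complements such as ccomp rather than rely on direct objects alone.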