How to parse more than one sentence from a text file using the Stanford dependency parser?
I have a text file with many lines and I want to parse all of the sentences. It looks like I read in all of them, but only the first sentence actually gets parsed, and I can't figure out where I'm going wrong.
import nltk
from nltk.parse.stanford import StanfordDependencyParser

dependency_parser = StanfordDependencyParser(model_path="edu\stanford\lp\models\lexparser\englishPCFG.ser.gz")

txtfile = open('sample.txt', encoding="latin-1")
s = txtfile.read()
print(s)

result = dependency_parser.raw_parse(s)
for i in result:
    print(list(i.triples()))
But it only gives the parse triples for the first sentence and not for the other sentences. Any help? The file sample.txt contains:
'i like this computer'
'The great Buddha, the .....'
'My Ashford experience .... great experience.'
[[(('i', 'VBZ'), 'nsubj', ("'", 'POS')), (('i', 'VBZ'), 'nmod', ('computer', 'NN')), (('computer', 'NN'), 'case', ('like', 'IN')), (('computer', 'NN'), 'det', ('this', 'DT')), (('computer', 'NN'), 'case', ("'", 'POS'))]]
You have to split up the text first. Right now you are parsing the literal text you posted, quotes and all; you can see that in this part of the parse: ("'", 'POS').

Since each line is a quoted string, it looks like you could apply ast.literal_eval to every line. Note that apostrophes (in words like "don't") will break that format, so you would have to handle them yourself, e.g. with line = line[1:-1]:
import ast
from nltk.parse.stanford import StanfordDependencyParser

dependency_parser = StanfordDependencyParser(model_path="edu\stanford\lp\models\lexparser\englishPCFG.ser.gz")

with open('sample.txt', encoding="latin-1") as f:
    # each line of sample.txt is a quoted string, e.g. 'i like this computer'
    lines = [ast.literal_eval(line) for line in f.readlines()]

for line in lines:
    parsed_lines = dependency_parser.raw_parse(line)
    # now parsed_lines should contain the parse of this line from the file
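The answer mentions falling back to line = line[1:-1] when an apostrophe makes ast.literal_eval choke on a line. A minimal sketch of that fallback, assuming every line is wrapped in exactly one pair of quotes (the try/except structure and the clean_line helper are my own illustration, not part of the original answer):

import ast

def clean_line(line):
    """Return the text inside the surrounding quotes of one line."""
    line = line.strip()
    try:
        # works for lines like 'i like this computer'
        return ast.literal_eval(line)
    except (SyntaxError, ValueError):
        # an apostrophe inside the line (e.g. "don't") breaks literal_eval,
        # so strip the outer quote characters by hand instead
        return line[1:-1]

with open('sample.txt', encoding="latin-1") as f:
    sentences = [clean_line(line) for line in f if line.strip()]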
Try:
from nltk.parse.stanford import StanfordDependencyParser

dependency_parser = StanfordDependencyParser(model_path="edu\stanford\lp\models\lexparser\englishPCFG.ser.gz")

with open('sample.txt') as fin:
    sents = fin.readlines()

# raw_parse_sents parses each sentence separately and yields
# one iterator of DependencyGraph objects per input sentence
results = dependency_parser.raw_parse_sents(sents)
for parsed_sent in results:
    for dep_graph in parsed_sent:
        print(list(dep_graph.triples()))
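raw_parse_sents treats every element of sents as one sentence, so this works as long as each line of sample.txt holds a single sentence. If your input is raw running text where a line can contain several sentences, you could split it into sentences first with nltk.sent_tokenize and then parse them all in one batch; a rough sketch (this splitting step is my addition, not part of the answer above):

import nltk
from nltk.parse.stanford import StanfordDependencyParser

dependency_parser = StanfordDependencyParser(model_path="edu\stanford\lp\models\lexparser\englishPCFG.ser.gz")

with open('sample.txt') as fin:
    text = fin.read()

# break the raw text into individual sentences first
# (requires the punkt tokenizer models: nltk.download('punkt'))
sentences = nltk.sent_tokenize(text)

for parsed_sent in dependency_parser.raw_parse_sents(sentences):
    for dep_graph in parsed_sent:
        print(list(dep_graph.triples()))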
Please take a look at the examples in the docstring code or the demo code in the repository; they are usually helpful.