
Extract a path of dependency relations from the ROOT to a token in spaCy?

I want to extract a path of dependency relations from the ROOT to a token in spaCy. My code extracts the whole path:

import spacy

sentence = "I saw the man with a telescop"

nlp = spacy.load('en_core_web_sm')  # the 'en' shortcut was removed in spaCy 3
doc = nlp(sentence)

for sent in doc.sents:
    for token in sent:
        print("{}\t{}\t{}\t{}".format(token.i, token.text, token.head, token.dep_))

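A dependency tree is essentially a graph, so if you want to find the (shortest) path to the ROOT, you can use a graph library such as networkx. Suppose you want to extract the path from the token "telescop" to the root. Then you could try something like this: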

import spacy
import networkx as nx

sentence = "I saw the man with a telescop"

nlp = spacy.load('en_core_web_sm')
doc = nlp(sentence)
edges = []

for sent in doc.sents:
    for token in sent:
        print("{}\t{}\t{}\t{}".format(token.i, token.text, token.head, token.dep_))
        # Remember the root as the target of the path search
        # (lowercased, to match the node names used for the edges).
        if token.dep_ == "ROOT":
            target = token.lower_
        # One edge per head -> child relation, keyed by lowercased text.
        for child in token.children:
            edges.append((token.lower_, child.lower_))

# Undirected graph over the dependency tree.
graph = nx.Graph(edges)
print(nx.shortest_path(graph, source="telescop", target=target))


Output:

0   I   saw nsubj
1   saw saw ROOT
2   the man det
3   man saw dobj
4   with    saw prep
5   a   telescop    det
6   telescop    with    pobj
['telescop', 'with', 'saw']
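
Since every token in a dependency tree has exactly one head, the same path can also be found without a graph library by following head pointers until the ROOT is reached. Below is a minimal sketch of that idea, not a definitive implementation. To keep it runnable without a model, it uses plain lists copied from the table printed above; the `heads` indices are filled in by hand from that table, and the helper names `path_to_root` and `root_to_token` are made up for illustration. With spaCy, the same values would come from `token.head.i` and `token.dep_`.

```python
# Parse copied from the table printed above.
# heads[i] is the index of token i's head; the ROOT is its own head.
# (Assumption: with spaCy these would be token.text, token.head.i, token.dep_.)
words = ["I", "saw", "the", "man", "with", "a", "telescop"]
heads = [1, 1, 3, 1, 1, 6, 4]
deps = ["nsubj", "ROOT", "det", "dobj", "prep", "det", "pobj"]


def path_to_root(index, heads):
    """Return token indices from `index` up to the ROOT, following heads."""
    path = [index]
    while heads[index] != index:  # the ROOT is its own head, so stop there
        index = heads[index]
        path.append(index)
    return path


def root_to_token(index, words, heads, deps):
    """Return (text, dep) pairs ordered from the ROOT down to the token."""
    return [(words[i], deps[i]) for i in reversed(path_to_root(index, heads))]


print(root_to_token(6, words, heads, deps))
# [('saw', 'ROOT'), ('with', 'prep'), ('telescop', 'pobj')]
```

Working with token indices instead of token text also avoids a limitation of the text-keyed graph above: if the same word occurs twice in a sentence, both occurrences would collapse into a single node.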