Extract POS tag for a word coming before a given word

Question

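I'm new to Python, and I'm trying to extract the part-of-speech (POS) tag of the word that comes before a given word, using Stanford CoreNLP, for the text = "A person prepares food using white bread, with a black cat beside them?"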

Here is my code:

for i in nouns:             
    pattren ="\w+(?=\s*"+i+"[^/])"
    re1 = re.search(pattren , text)
    if(re1):
        for tag in tagger.tag(text.split()):       #POS tag extractor
            if re1[0] in tag[1]:
                for specific in tag[1].split():
                    if re1[0] in specific:
                        print("The Noun " + i + ":-")
                        print(specific)

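Here, nouns is a list containing all the NN words in the text, ['human', 'food', 'use', 'side', 'cat']. I tried to use a regular expression to extract the word that precedes each noun.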

The output is:

The Noun طعام:-
يحضر/VBP
The Noun استخدام:-
ب/IN
The Noun استخدام:-
الخبز/DTNN
The Noun استخدام:-
الابيض/DTJJ
The Noun استخدام:-
ب/IN
The Noun استخدام:-
جانب/NN
The Noun جانب:-
ب/IN
The Noun جانب:-
الخبز/DTNN
The Noun جانب:-
الابيض/DTJJ
The Noun جانب:-
ب/IN
The Noun جانب:-
جانب/NN
The Noun قطة:-
ه/PRP$
The Noun قطة:-
ه/PRP$

The output contains duplicate words, and I can't figure out how to fix it.

Answer 1

The problem is in this line:

if re1[0] in tag[1]:

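This condition is true for every tag[1] string that contains re1[0] anywhere inside it, whether re1[0] is a whole word or just a single character within a longer word.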
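The over-matching can be reproduced in isolation. The sketch below assumes that tag[1] holds strings in token/TAG form, as the output in the question suggests; the list `tagged` is copied from that output, and `previous_word` is a hypothetical stand-in for re1[0].

```python
# Tagged tokens copied from the question's output (token/TAG form).
tagged = ["يحضر/VBP", "ب/IN", "الخبز/DTNN", "الابيض/DTJJ", "جانب/NN"]

# Hypothetical stand-in for re1[0]: the single-letter word "ب".
previous_word = "ب"

# Same test as `if re1[0] in tag[1]`: a plain substring check.
substring_hits = [item for item in tagged if previous_word in item]

# Every token that merely contains the letter is reported,
# not only the standalone word "ب/IN".
print(substring_hits)
```

Four of the five tagged tokens pass the check, which is consistent with the repeated lines in the question's output.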

As a solution, I used a regular expression to match the exact word in tag[1]:

if re.match(r'\b' + re.escape(re1[0]) + r'(?!\.?\d)', tag[1]):
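Note that re.match only anchors the start of the string, so a token that merely begins with the word would still pass. A stricter variant, sketched here under the assumption that tag[1] holds token/TAG strings as in the question's output, splits off the tag and compares the token for equality. The helper name `exact_tag_matches` and the extra sample item "بيت/NN" are hypothetical and used only for illustration.

```python
def exact_tag_matches(tagged_tokens, word):
    """Return the token/TAG strings whose token part equals `word` exactly."""
    matches = []
    for item in tagged_tokens:
        # Split at the last "/" so the token is separated from its POS tag.
        token, _, _tag = item.rpartition("/")
        if token == word:
            matches.append(item)
    return matches


# Sample data in token/TAG form; "بيت/NN" is an added hypothetical item
# that starts with the searched letter but is a different word.
tagged = ["يحضر/VBP", "ب/IN", "الخبز/DTNN", "بيت/NN", "جانب/NN"]

print(exact_tag_matches(tagged, "ب"))
```

With this comparison only "ب/IN" is returned, because "بيت/NN" and "جانب/NN" contain the letter but are not equal to it as a token.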

python

part-of-speech

stanford-nlp