spaCy Matcher pattern for specific nouns

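I'm trying to match a specific pattern: any verb followed by a noun that ends in s, t, or l. For example: like cat, eat meal, make spices.

How can I do this?

Here is what I have so far: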

import spacy
from spacy.matcher import Matcher

nlp = spacy.load("en_core_web_sm")
matcher = Matcher(nlp.vocab)
pattern = [{"POS": "VERB"}, {"POS": "NOUN"}]
matcher.add("mypattern", [pattern])
doc = nlp(Verbwithnoun)  # Verbwithnoun holds the input text
matches = matcher(doc)

for match_id, start, end in matches:
    string_id = nlp.vocab.strings[match_id]  # name of the matched pattern
    print(doc[start:end])

But this prints every verb followed by a noun, not just the ones where the noun ends in t, l, or s. How can I get spaCy to match only nouns ending in t, l, or s?

You can post-process the results by checking whether each matched phrase ends with one of the three letters:

import spacy
from spacy.matcher import Matcher

nlp = spacy.load("en_core_web_sm")
matcher = Matcher(nlp.vocab)
pattern = [{"POS": "VERB"}, {"POS": "DET", "OP" : "?"}, {"POS": "NOUN"}]
matcher.add("mypattern", [pattern])
Verbwithnoun = "I know the language. I like the cat, I eat a meal, I make spices."
doc = nlp(Verbwithnoun)
matches = matcher(doc)

for match_id, start, end in matches:
    phrase = doc[start:end]
    # The noun is the last token of the match, so the phrase's final letter is the noun's final letter
    if phrase.text.endswith(("s", "t", "l")):
        print(phrase)

Output:

like the cat
eat a meal
make spices
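
To see the filtering step in isolation, here is a minimal sketch that needs no spaCy model. The helper name `ends_with_any` is hypothetical, and the `candidates` list assumes the pattern above matched the four verb phrases in the sample sentence (what the tagger actually returns may differ):

```python
def ends_with_any(phrase: str, suffixes=("s", "t", "l")) -> bool:
    """Return True if the phrase ends with any of the given suffixes."""
    # str.endswith accepts a tuple, so a single call covers all three letters
    return phrase.endswith(tuple(suffixes))


# Assumed matcher results for the sample sentence, before filtering
candidates = ["know the language", "like the cat", "eat a meal", "make spices"]

kept = [p for p in candidates if ends_with_any(p)]
print(kept)
```

"know the language" is dropped because it ends in "e"; the other three phrases pass the check.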

Post-processing works fine, but you can also use a regular expression directly in the pattern. See the docs.

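import spacy
from spacy.matcher import Matcher

nlp = spacy.load("en_core_web_sm")
matcher = Matcher(nlp.vocab)
# The REGEX restricts the noun's text to strings ending in l, s, or t.
# There is no optional determiner here, so the noun must directly follow the verb.
pattern = [{"POS": "VERB"}, {"POS": "NOUN", "TEXT": {"REGEX": "[lst]$"}}]
matcher.add("mypattern", [pattern])
Verbwithnoun = "I know the language. I like the cat, I eat a meal, I make spices."
doc = nlp(Verbwithnoun)
matches = matcher(doc)

for match_id, start, end in matches:
    string_id = nlp.vocab.strings[match_id]  # name of the matched pattern
    print(doc[start:end])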
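
To check what `[lst]$` accepts before putting it into a pattern, it can be tried with Python's `re` module on its own. This is only a sketch: the word list is made up for illustration, and it assumes the Matcher applies the expression search-style to the token text, which is why the `$` anchor matters.

```python
import re

# Same expression as in the pattern: one of l, s, t immediately before the end of the string
ends_in_lst = re.compile(r"[lst]$")

# Illustrative words only (not from the original post)
nouns = ["cat", "meal", "spices", "language", "Cat", "CAT"]
accepted = [noun for noun in nouns if ends_in_lst.search(noun)]
print(accepted)

# The character class is case-sensitive; an inline (?i) flag is one way to relax that
ends_in_lst_ci = re.compile(r"(?i)[lst]$")
print(bool(ends_in_lst_ci.search("CAT")))
```

"language" is rejected because it ends in "e", and "CAT" is rejected by the case-sensitive version because it ends in an uppercase "T". To also allow a determiner between verb and noun, the optional token from the earlier answer can be combined with the regex: `[{"POS": "VERB"}, {"POS": "DET", "OP": "?"}, {"POS": "NOUN", "TEXT": {"REGEX": "[lst]$"}}]`.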