Specify NLTK feature grammar within Python function in code

Question

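I have parsed an input string by loading a grammar from a .fcfg file, as shown in the NLTK book. Is there any way to specify this grammar within the Python function itself?

Grammar: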

% start S
S[SEM=(?np + WHERE + ?vp)] -> NP[SEM=?np] VP[SEM=?vp]
VP[SEM=(?v + ?pp)] -> IV[SEM=?v] PP[SEM=?pp]
VP[SEM=(?v + ?ap)] -> IV[SEM=?v] AP[SEM=?ap]
NP[SEM=(?det + ?n)] -> Det[SEM=?det] N[SEM=?n]
PP[SEM=(?p + ?np)] -> P[SEM=?p] NP[SEM=?np]
AP[SEM=?pp] -> A[SEM=?a] PP[SEM=?pp]
NP[SEM='Country="greece"'] -> 'Greece'
NP[SEM='Country="china"'] -> 'China'
Det[SEM='SELECT'] -> 'Which' | 'What'
N[SEM='City FROM city_table'] -> 'cities'
IV[SEM=''] -> 'are'
A[SEM=''] -> 'located'
P[SEM=''] -> 'in'

I need this because I have to create the grammar dynamically, with respect to the input string, before parsing.

Answer 1

Yes, use the nltk.grammar.FeatureGrammar.fromstring() function, e.g.

from nltk import grammar, parse
from nltk.parse.generate import generate

# Define the feature grammar as a plain string instead of loading it from a .fcfg file.
g = """
S[SEM=(?np + WHERE + ?vp)] -> NP[SEM=?np] VP[SEM=?vp]
VP[SEM=(?v + ?pp)] -> IV[SEM=?v] PP[SEM=?pp]
VP[SEM=(?v + ?ap)] -> IV[SEM=?v] AP[SEM=?ap]
NP[SEM=(?det + ?n)] -> Det[SEM=?det] N[SEM=?n]
PP[SEM=(?p + ?np)] -> P[SEM=?p] NP[SEM=?np]
AP[SEM=?pp] -> A[SEM=?a] PP[SEM=?pp]
NP[SEM='Country="greece"'] -> 'Greece'
NP[SEM='Country="china"'] -> 'China'
Det[SEM='SELECT'] -> 'Which' | 'What'
N[SEM='City FROM city_table'] -> 'cities'
IV[SEM=''] -> 'are'
A[SEM=''] -> 'located'
P[SEM=''] -> 'in'
"""

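# Build the grammar object from the string. It is named fg so that it does
# not shadow the imported `grammar` module.
fg = grammar.FeatureGrammar.fromstring(g)

# Print the first 30 sentences the grammar can generate.
for sent in generate(fg, n=30):
    print(sent)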

[out]:

['Which', 'cities', 'are', 'in', 'Which', 'cities']
['Which', 'cities', 'are', 'in', 'What', 'cities']
['Which', 'cities', 'are', 'in', 'Greece']
['Which', 'cities', 'are', 'in', 'China']
['Which', 'cities', 'are', 'located', 'in', 'Which', 'cities']
['Which', 'cities', 'are', 'located', 'in', 'What', 'cities']
['Which', 'cities', 'are', 'located', 'in', 'Greece']
['Which', 'cities', 'are', 'located', 'in', 'China']
['What', 'cities', 'are', 'in', 'Which', 'cities']
['What', 'cities', 'are', 'in', 'What', 'cities']
['What', 'cities', 'are', 'in', 'Greece']
['What', 'cities', 'are', 'in', 'China']
['What', 'cities', 'are', 'located', 'in', 'Which', 'cities']
['What', 'cities', 'are', 'located', 'in', 'What', 'cities']
['What', 'cities', 'are', 'located', 'in', 'Greece']
['What', 'cities', 'are', 'located', 'in', 'China']
['Greece', 'are', 'in', 'Which', 'cities']
['Greece', 'are', 'in', 'What', 'cities']
['Greece', 'are', 'in', 'Greece']
['Greece', 'are', 'in', 'China']
['Greece', 'are', 'located', 'in', 'Which', 'cities']
['Greece', 'are', 'located', 'in', 'What', 'cities']
['Greece', 'are', 'located', 'in', 'Greece']
['Greece', 'are', 'located', 'in', 'China']
['China', 'are', 'in', 'Which', 'cities']
['China', 'are', 'in', 'What', 'cities']
['China', 'are', 'in', 'Greece']
['China', 'are', 'in', 'China']
['China', 'are', 'located', 'in', 'Which', 'cities']
['China', 'are', 'located', 'in', 'What', 'cities']
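
Since the question asks about building the grammar dynamically and then parsing, here is a minimal sketch of how that might look. The names `BASE_RULES`, `COUNTRY_RULE`, `build_grammar` and `question_to_sql` are hypothetical helpers made up for illustration. Only `FeatureGrammar.fromstring()` and `FeatureChartParser` come from NLTK. It assumes each country name is a single token and that the input is tokenized with a plain `split()`.

```python
from nltk import grammar
from nltk.parse import FeatureChartParser

# Fixed part of the grammar from the question (everything except the country rules).
BASE_RULES = """
% start S
S[SEM=(?np + WHERE + ?vp)] -> NP[SEM=?np] VP[SEM=?vp]
VP[SEM=(?v + ?pp)] -> IV[SEM=?v] PP[SEM=?pp]
VP[SEM=(?v + ?ap)] -> IV[SEM=?v] AP[SEM=?ap]
NP[SEM=(?det + ?n)] -> Det[SEM=?det] N[SEM=?n]
PP[SEM=(?p + ?np)] -> P[SEM=?p] NP[SEM=?np]
AP[SEM=?pp] -> A[SEM=?a] PP[SEM=?pp]
Det[SEM='SELECT'] -> 'Which' | 'What'
N[SEM='City FROM city_table'] -> 'cities'
IV[SEM=''] -> 'are'
A[SEM=''] -> 'located'
P[SEM=''] -> 'in'
"""

# Template for one lexical rule; hypothetical, modelled on the 'Greece'/'China' rules.
COUNTRY_RULE = """NP[SEM='Country="{sem}"'] -> '{word}'"""


def build_grammar(countries):
    """Return a FeatureGrammar with one NP rule per country name."""
    lexical = "\n".join(
        COUNTRY_RULE.format(sem=name.lower(), word=name) for name in countries
    )
    return grammar.FeatureGrammar.fromstring(BASE_RULES + lexical)


def question_to_sql(question, countries):
    """Parse the question and join the non-empty SEM pieces of the first tree."""
    parser = FeatureChartParser(build_grammar(countries))
    try:
        trees = list(parser.parse(question.split()))
    except ValueError:
        # Raised when the grammar does not cover some input word.
        return None
    if not trees:
        return None
    sem = trees[0].label()["SEM"]
    return " ".join(piece for piece in sem if piece)


if __name__ == "__main__":
    print(question_to_sql("What cities are located in China", ["Greece", "China"]))
```

For the question "What cities are located in China" this should print SELECT City FROM city_table WHERE Country="china", the same result the NLTK book gets from the .fcfg file. Because the grammar is rebuilt on every call, the list of countries can be derived from the input string or any other source before parsing.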


Tags: python, nlp, nltk, context-free-grammar, semantics