Pyparsing fails to parse multiple rules
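I am trying to build a boolean query parser with some special rules, such as adjacency (ADJ) and proximity (near) operators. The rules I have created so far are: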
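from pyparsing import (
    CaselessLiteral, Combine, OneOrMore, Optional, StringEnd, Word,
    infixNotation, nums, opAssoc, printables, quotedString, removeQuotes,
    replaceWith,
)

## DEFINITIONS OF SYMBOLS ###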
NEAR = CaselessLiteral('near').suppress()
NUMBER = Word(nums)
NONEDIRECTIONAL = Combine(NEAR+NUMBER)
ADJ = CaselessLiteral("ADJ").setParseAction(replaceWith('0'))
OAND = CaselessLiteral("and")
OOR = CaselessLiteral("or")
ONOT = CaselessLiteral("not")
## ----------------------- ##
## DEFINITIONS OF TERMS ###
# Do not break quoted string.
QUOTED = quotedString.setParseAction(removeQuotes)
# space-separated words are easiest to define using just OneOrMore
# must use a negative lookahead for and/not/or operators, and this must come
# at the beginning of the expression
WORDWITHSPACE = OneOrMore(~(OAND | ONOT | OOR | NONEDIRECTIONAL | ADJ) +
                          Word(printables, excludeChars="()"))
# use a parse action to recombine words into a single string
WORDWITHSPACE.addParseAction(lambda t: ' '.join(t))
TERM = (QUOTED | WORDWITHSPACE)
## ----------------------- ##
## DEFINITIONS OF EXPRESSION ###
EXPRESSION = infixNotation(TERM,
    [
        (ADJ, 2, opAssoc.LEFT),
        (NONEDIRECTIONAL, 2, opAssoc.LEFT),
        (ONOT, 1, opAssoc.RIGHT),
        (Optional(OAND, default='and'), 2, opAssoc.LEFT),
        (OOR, 2, opAssoc.LEFT)
    ])
# As more than one occurrence of these symbols can appear together, we are
# using `OneOrMore` expressions
BOOLQUERY = OneOrMore(EXPRESSION) + StringEnd()
## ----------------------- ##
When I run
((a or b) and (b and c)) or (a and d)
it works fine, but when I try to parse
((((smart ADJ contract*) and agreement) or (enforced near3 without near3 interaction) or (automated ADJ escrow)) or ((protocol* or Consensus ADJ algorithm) near5 (agreement and transaction)))
the code hangs and never finishes.
Can anyone help me figure out where I went wrong?
Updated code:
EXPRESSION = infixNotation(TERM,
    [
        (ONOT, 1, opAssoc.RIGHT),
        (Optional(OAND, default='and'), 2, opAssoc.LEFT),
        ((OOR | NONEDIRECTIONAL | ADJ), 2, opAssoc.LEFT)
    ])
I am keeping AND optional because of cases like:
x not y not z
Your program takes a long time because your infixNotation is 5 levels deep and includes an optional AND operator.
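To get a feel for why the number of levels matters, here is a toy model in plain Python. It is not pyparsing's actual implementation; `count_atom_parses` and `OPERATORS` are hypothetical names made up for this illustration. It assumes each precedence level first tries "operand OP operand" and, when that fails, re-parses a bare operand from the same position, and it counts how often the innermost rule runs with and without caching results per (level, position).

```python
# Toy model only: one single-character operator per precedence level.
OPERATORS = "|&+*^"


def count_atom_parses(text, levels, memoize):
    """Parse `text`; return (end position or None, times the atom rule ran)."""
    cache = {}
    atom_runs = 0

    def parse_atom(pos):
        # atom := "(" expression ")" | single letter
        if text[pos:pos + 1] == "(":
            inner_end = parse(levels, pos + 1)
            if inner_end is not None and text[inner_end:inner_end + 1] == ")":
                return inner_end + 1
            return None
        if text[pos:pos + 1].isalpha():
            return pos + 1
        return None

    def parse(level, pos):
        nonlocal atom_runs
        key = (level, pos)
        if memoize and key in cache:
            return cache[key]  # reuse the result already computed here
        if level == 0:
            atom_runs += 1
            result = parse_atom(pos)
        else:
            # First attempt: operand OP operand, using this level's operator.
            result = None
            left_end = parse(level - 1, pos)
            if (left_end is not None
                    and text[left_end:left_end + 1] == OPERATORS[level - 1]):
                result = parse(level - 1, left_end + 1)
            if result is None:
                # Fallback: parse a bare operand again from the same position.
                result = parse(level - 1, pos)
        if memoize:
            cache[key] = result
        return result

    return parse(levels, 0), atom_runs


# Compare the work done on increasingly nested input with 5 levels.
for depth in range(3):
    text = "(" * depth + "x" + ")" * depth
    plain = count_atom_parses(text, 5, memoize=False)[1]
    cached = count_atom_parses(text, 5, memoize=True)[1]
    print(f"depth={depth}: atom rule ran {plain} times without cache, {cached} with")
```

In this model every extra level doubles the work and every extra pair of parentheses multiplies it again, while the cached version does one atom parse per position. The real grammar behaves differently in detail, but the growth pattern is the same kind of problem.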
I was able to run it as-is just by enabling packrat parsing. Do this by adding the following to the top of your script (after importing pyparsing):
ParserElement.enablePackrat()
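For placement, here is a minimal end-to-end sketch that puts the packrat call ahead of the grammar from the question and parses the simpler of the two queries. The `query` and `result` names are just for illustration, and using `parseString(..., parseAll=True)` on EXPRESSION in place of BOOLQUERY is my assumption about what is wanted, not something the question requires.

```python
from pyparsing import (
    CaselessLiteral, Combine, OneOrMore, Optional, ParserElement, Word,
    infixNotation, nums, opAssoc, printables, quotedString, removeQuotes,
    replaceWith,
)

# Turn on memoization right after the import, before building the grammar.
ParserElement.enablePackrat()

# Operators, unchanged from the question.
NEAR = CaselessLiteral('near').suppress()
NUMBER = Word(nums)
NONEDIRECTIONAL = Combine(NEAR + NUMBER)
ADJ = CaselessLiteral("ADJ").setParseAction(replaceWith('0'))
OAND = CaselessLiteral("and")
OOR = CaselessLiteral("or")
ONOT = CaselessLiteral("not")

# Terms: a quoted string, or a run of words that are not operators.
QUOTED = quotedString.setParseAction(removeQuotes)
WORDWITHSPACE = OneOrMore(~(OAND | ONOT | OOR | NONEDIRECTIONAL | ADJ) +
                          Word(printables, excludeChars="()"))
WORDWITHSPACE.addParseAction(lambda t: ' '.join(t))
TERM = (QUOTED | WORDWITHSPACE)

# The original five precedence levels, including the optional AND.
EXPRESSION = infixNotation(TERM,
    [
        (ADJ, 2, opAssoc.LEFT),
        (NONEDIRECTIONAL, 2, opAssoc.LEFT),
        (ONOT, 1, opAssoc.RIGHT),
        (Optional(OAND, default='and'), 2, opAssoc.LEFT),
        (OOR, 2, opAssoc.LEFT)
    ])

query = "((a or b) and (b and c)) or (a and d)"
# parseAll=True requires the whole string to match, like adding StringEnd().
result = EXPRESSION.parseString(query, parseAll=True)
print(result.asList())
```

The longer query from the question can be passed to the same `parseString` call.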
To run your tests, I used runTests. It is not clear to me why BOOLQUERY is needed, since you are only parsing expressions:
tests = """\
((a or b) and (b and c)) or (a and d)
((((smart ADJ contract*) and agreement) or (enforced near3 without near3 interaction) or (automated ADJ escrow)) or ((protocol* or Consensus ADJ algorithm) near5 (agreement and transaction)))
"""
EXPRESSION.runTests(tests)
This gives:
((a or b) and (b and c)) or (a and d)
[[[['a', 'or', 'b'], 'and', ['b', 'and', 'c']], 'or', ['a', 'and', 'd']]]
[0]:
  [[['a', 'or', 'b'], 'and', ['b', 'and', 'c']], 'or', ['a', 'and', 'd']]
  [0]:
    [['a', 'or', 'b'], 'and', ['b', 'and', 'c']]
    [0]:
      ['a', 'or', 'b']
    [1]:
      and
    [2]:
      ['b', 'and', 'c']
  [1]:
    or
  [2]:
    ['a', 'and', 'd']

((((smart ADJ contract*) and agreement) or (enforced near3 without near3 interaction) or (automated ADJ escrow)) or ((protocol* or Consensus ADJ algorithm) near5 (agreement and transaction)))
[[[[['smart', '0', 'contract*'], 'and', 'agreement'], 'or', ['enforced', '3', 'without', '3', 'interaction'], 'or', ['automated', '0', 'escrow']], 'or', [['protocol*', 'or', ['Consensus', '0', 'algorithm']], '5', ['agreement', 'and', 'transaction']]]]
[0]:
  [[[['smart', '0', 'contract*'], 'and', 'agreement'], 'or', ['enforced', '3', 'without', '3', 'interaction'], 'or', ['automated', '0', 'escrow']], 'or', [['protocol*', 'or', ['Consensus', '0', 'algorithm']], '5', ['agreement', 'and', 'transaction']]]
  [0]:
    [[['smart', '0', 'contract*'], 'and', 'agreement'], 'or', ['enforced', '3', 'without', '3', 'interaction'], 'or', ['automated', '0', 'escrow']]
    [0]:
      [['smart', '0', 'contract*'], 'and', 'agreement']
      [0]:
        ['smart', '0', 'contract*']
      [1]:
        and
      [2]:
        agreement
    [1]:
      or
    [2]:
      ['enforced', '3', 'without', '3', 'interaction']
    [3]:
      or
    [4]:
      ['automated', '0', 'escrow']
  [1]:
    or
  [2]:
    [['protocol*', 'or', ['Consensus', '0', 'algorithm']], '5', ['agreement', 'and', 'transaction']]
    [0]:
      ['protocol*', 'or', ['Consensus', '0', 'algorithm']]
      [0]:
        protocol*
      [1]:
        or
      [2]:
        ['Consensus', '0', 'algorithm']
    [1]:
      5
    [2]:
      ['agreement', 'and', 'transaction']