NodeVisitor class for PEG parser in Python

Imagine strings of the following kind:

if ((a1 and b) or (a2 and c)) or (c and d) or (e and f)

Now I want to get the expressions in parentheses, so I wrote a PEG parser with the following grammar:

from parsimonious.grammar import Grammar

grammar = Grammar(
    r"""
    program     = if expr+
    expr        = term (operator term)*
    term        = (factor operator factor) / factor
    factor      = (lpar word operator word rpar) / (lpar expr rpar)

    if          = "if" ws
    and         = "and"
    or          = "or"
    operator    = ws? (and / or) ws?

    word        = ~"\w+"
    lpar        = "("
    rpar        = ")"

    ws          = ~"\s*"
    """)

This parses fine:

string = "if ((a1 and b) or (a2 and c)) or (c and d) or (e and f)"
tree = grammar.parse(string)

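Now the question: how do I write a NodeVisitor class for this tree that collects only the factors? My problem is the second branch of factor, which can be deeply nested.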


I tried

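def walk(node, level=0):
    # Print every node produced by the "factor" rule, prefixed by its depth.
    if node.expr.name == "factor":
        print(level * "-", node.text)

    # Keep descending, even below a factor that was just printed.
    for child in node.children:
        walk(child, level + 1)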

walk(tree)

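but that did not really help (nested factors are printed again, so factors show up more than once).
Note: this question is based on another one on StackOverflow.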

If you only want each outermost factor, return early and do not descend into its children.

def walk(node, level=0):
    if node.expr.name == "factor":
        print(level * "-", node.text)
        # Outermost factor found: stop here instead of visiting its children.
        return
    for child in node.children:
        walk(child, level + 1)

Output:

----- ((a1 and b) or (a2 and c))
----- (c and d)
------ (e and f)
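
As a variation on the early-return walk above, the matches can be collected in a list instead of printed. This is only a sketch: the names `outermost` and `make` are hypothetical, and the stand-in nodes built with `SimpleNamespace` exist purely to make the snippet runnable without a grammar. The helper relies on nothing but the `expr.name`, `text` and `children` attributes that `walk` already uses, so it should accept the `tree` from the question as well.

```python
from types import SimpleNamespace


def outermost(node, rule_name="factor"):
    """Return the text of every outermost node produced by rule_name."""
    # Early return: once a match is found, its subtree is not searched.
    if node.expr.name == rule_name:
        return [node.text]
    found = []
    for child in node.children:
        found.extend(outermost(child, rule_name))
    return found


def make(name, text, *children):
    # Hypothetical stand-in for a parse-tree node (assumption: it mimics
    # only the three attributes used above, nothing else).
    return SimpleNamespace(expr=SimpleNamespace(name=name),
                           text=text, children=list(children))


inner = make("factor", "(a and b)")
outer = make("factor", "((a and b) or c)", inner)
root = make("expr", "((a and b) or c) or (d and e)",
            outer, make("factor", "(d and e)"))

result = outermost(root)
print(result)
```

The nested factor `(a and b)` is not reported, because the search stops at the enclosing factor.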

How would I go about it to get ((a1 and b) or (a2 and c)), (c and d) and (e and f) as three parts?

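You could create a visitor that "listens" for parentheses in the parse tree: whenever a node is a (, a depth variable is incremented, and whenever a ) is encountered, it is decremented. Then, in the method that is called for the matched parenthesized expression, check the depth before adding the expression to the list that the visitor returns.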

Here is a quick example:

from parsimonious.grammar import Grammar
from parsimonious.nodes import NodeVisitor

grammar = Grammar(
    r"""
    program     = if expr+
    expr        = term (operator term)*
    term        = (lpar expr rpar) / word

    if          = "if" ws
    and         = "and"
    or          = "or"
    operator    = ws? (and / or) ws?

    word        = ~"\w+"
    lpar        = "("
    rpar        = ")"

    ws          = ~"\s*"
    """)


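class ParExprVisitor(NodeVisitor):

    def __init__(self):
        self.depth = 0      # number of currently open parentheses
        self.par_expr = []  # outermost terms collected so far

    def visit_term(self, node, visited_children):
        # Children are visited before their parent, so the term's own
        # parentheses are already balanced out here; depth is 0 only
        # for a term that is not nested inside another one.
        if self.depth == 0:
            self.par_expr.append(node.text)

    def visit_lpar(self, node, visited_children):
        self.depth += 1

    def visit_rpar(self, node, visited_children):
        self.depth -= 1

    def generic_visit(self, node, visited_children):
        # Every other node, including the root, returns the shared list.
        return self.par_expr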


tree = grammar.parse("if ((a1 and b) or (a2 and c)) or (c and d) or (e and f)")
visitor = ParExprVisitor()

for expr in visitor.visit(tree):
    print(expr)

This prints:

((a1 and b) or (a2 and c))
(c and d)
(e and f)
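
For comparison, the same depth-counting idea can be sketched directly on the characters of the string, without a parser. This is not the visitor approach above: it does not check the input against the grammar, it only balances parentheses. The function name `top_level_groups` is hypothetical.

```python
def top_level_groups(text):
    """Return each outermost parenthesized group in text, in order."""
    groups = []
    depth = 0     # number of currently open parentheses
    start = None  # index where the current outermost group began
    for i, ch in enumerate(text):
        if ch == "(":
            if depth == 0:
                start = i
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced ')' at index %d" % i)
            if depth == 0:
                # Back at the top level: the group is complete.
                groups.append(text[start:i + 1])
    if depth != 0:
        raise ValueError("unbalanced '('")
    return groups


groups = top_level_groups(
    "if ((a1 and b) or (a2 and c)) or (c and d) or (e and f)")
for group in groups:
    print(group)
```

On the example string this yields the same three groups as the visitor. The visitor remains the better fit when the input must also be validated against the grammar.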