如何创建一个考虑“|”的抽象语法树? (层/ Yacc)

How can I create an abstract syntax tree considering '|'? (Ply / Yacc)

考虑以下语法:

expr : expr '+' term | expr '-' term | term
term : term '*' factor | term '/' factor | factor
factor : '(' expr ')' | identifier | number

这是我使用 ply 的代码:

from ply import lex, yacc

tokens = [
    "identifier",
    "number",
    "plus",
    "minus",
    "mult",
    "div"
]

t_ignore = r" \t"
t_identifier = r"^[a-zA-Z]+$"
t_number = r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?"
t_plus = r"\+"
t_minus = r"-"
t_mult = r"\*"
t_div = r"/"

def p_stmt(p):
    """stmt : expr"""
    p[0] = ("stmt", p[1])

def p_expr(p):
    """expr : expr plus term 
            | expr minus term 
            | term"""
    p[0] = ("expr", p[1], p[2]) # Problem here <<<

def p_term(p):
    """term : term mult factor 
            | term div factor 
            | factor"""

def p_factor(p):
    """factor : '(' expr ')' 
              | identifier 
              | number"""


if __name__ == "__main__":
    lex.lex()
    yacc.yacc()
    data = "32 + 10"
    result = yacc.parse(data)
    print(result)

如果我无法访问运算符,我应该如何使用表达式构建 AST?我可以像 p_expr_plus 这样分隔函数,但在这种情况下,我会消除运算符优先级。 docs are not so helpful, since I'm a beginner and can't solve this problem. The best material I've found on the subject is this,但是没有考虑运算符优先级的复杂度。

编辑:我无法访问 p2 or p[3], since I get an IndexError (It's matching the term only). In the PDF I've linked, they explicitly put the operator inside the tuple, like: ('+', p1, p2),因此,证明了我考虑优先级的问题(我无法分离函数,表达式是表达式,应该有一种方法可以考虑管道并访问任何操作员)。

据我所知,在 p[0] = ("expr", p[1], p[2]) 中,p[1] 是左手表达式,p[2] 是运算符,p[3](你没有使用)将是正确的术语。

只需使用 p[2] 来确定运算符,添加 p[3],因为您将需要它,您应该可以开始了。

此外,您必须验证 p 有多少项目,因为如果最后一条规则 | term""" 匹配,p 将只有两个项目而不是四个。

查看 GardenSnake example:

中的片段
def p_comparison(p):
    """comparison : comparison PLUS comparison
                  | comparison MINUS comparison
                  | comparison MULT comparison
                  | comparison DIV comparison
                  | comparison LT comparison
                  | comparison EQ comparison
                  | comparison GT comparison
                  | PLUS comparison
                  | MINUS comparison
                  | power"""
    if len(p) == 4:
        p[0] = binary_ops[p[2]]((p[1], p[3]))
    elif len(p) == 3:
        p[0] = unary_ops[p[1]](p[2])
    else:
        p[0] = p[1]