使 while 解析函数递归

Make a while parsing function recursive

我有一个解析 lisp 表达式的基本函数。它使用 while 循环,但作为练习,我想将其转换为递归函数。但是,这对我来说有点棘手。这是我到目前为止所拥有的:

def build_ast(self, tokens=None):
    # next two lines example input to make self-contained
    LEFT_PAREN, RIGHT_PAREN = '(', ')'
    tokens = ['(', '+', '2', '(', '*', '3', '4', ')', ')']
    while RIGHT_PAREN in tokens:
        right_idx = tokens.index(RIGHT_PAREN)
        left_idx = right_idx - tokens[:right_idx][::-1].index(LEFT_PAREN)-1
        extraction = [tokens[left_idx+1:right_idx],]
        tokens = tokens[:left_idx] + extraction + tokens[right_idx+1:]
    ast = tokens
    return ast

所以它会解析这样的东西:

(+ 2 (* 3 4))

进入这个:

[['+', '2', ['*', '3', '4']]]

如何使上述函数递归的示例是什么?到目前为止,我已经开始使用类似的东西:

def build_ast(self, ast=None):
    if ast is None: ast=self.lexed_tokens
    if RIGHT_PAREN not in ast:
        return ast
    else:
        right_idx = ast.index(RIGHT_PAREN)
        left_idx = right_idx - ast[:right_idx][::-1].index(LEFT_PAREN)-1
        ast = ast[:left_idx] + [ast[left_idx+1:right_idx],] + ast[right_idx+1:]
        return self.build_ast(ast)

但它给人的感觉有点奇怪(好像递归在这里没有用)。构建它的更好方法是什么?或者可能是一个 better/more 优雅的算法来构建这个简单的 ast?

您可以使用递归生成器函数:

def _build_ast(tokens):
   LEFT_PAREN, RIGHT_PAREN = '(', ')'
   #consume the iterator until it is empty or a right paren occurs
   while (n:=next(tokens, None)) is not None and n != RIGHT_PAREN:
      #recursively call _build_ast if we encounter a left paren
      yield n if n != LEFT_PAREN else list(_build_ast(tokens))
   

def build_ast(tokens):
   #pass tokens as an iterator to _build_ast
   return list(_build_ast(iter(tokens)))

tokens = ['(', '+', '2', '(', '*', '3', '4', ')', ')']
print(build_ast(tokens))

输出:

[['+', '2', ['*', '3', '4']]]

与其他答案类似,我会将结束当前表达式的标记传递给递归函数。这通常是右括号,但对于第一次调用,它将是输入结束 (None)。

def build_ast(tokens):
    LEFT_PAREN, RIGHT_PAREN = '(', ')'
    it = iter(tokens)  # Iterator over the input
    
    # Recursive (generator) function that processes tokens until the close 
    #   of the expression, i.e until the given token is encountered
    def recur(until=RIGHT_PAREN):
        # Keep processing tokens until closing token is encountered
        while (token := next(it, None)) != until:
            # If parenthesis opens, recur and convert to list
            #    otherwise just yield the token as-is
            yield list(recur()) if token == LEFT_PAREN else token

    # Main recursive call: process until end of input (i.e. until None)
    return list(recur(None))

呼叫为:

ast = build_ast(['(', '+', '2', '(', '*', '3', '4', ')', ')'])

其他两种方法都很棒,这里还有一种:

# Helper function: pop from left or return default
pops = lambda l, d=None: l.pop(0) if l else d

def read_from_tokens(tokens):
    L = []
    while (token := pops(tokens, ')')) != ')':
        L.append(token if token!='(' else read_from_tokens(tokens))
    return L