Problems with setResultName in pyparsing

Question

I am having trouble parsing arithmetic expressions with pyparsing. I have the following grammar:

numeric_value = (integer_format | float_format | bool_format)("value*")
identifier = Regex('[a-zA-Z_][a-zA-Z_0-9]*')("identifier*")

operand = numeric_value | identifier

expop = Literal('^')("op")
signop = oneOf('+ -')("op")
multop = oneOf('* /')("op")
plusop = oneOf('+ -')("op")
factop = Literal('!')("op")

arithmetic_expr = infixNotation(operand,
    [("!", 1, opAssoc.LEFT),
     ("^", 2, opAssoc.RIGHT),
     (signop, 1, opAssoc.RIGHT),
     (multop, 2, opAssoc.LEFT),
     (plusop, 2, opAssoc.LEFT),]
    )("expr")

I want to use it to parse arithmetic expressions, for example:

expr = "9 + 2 * 3"
parse_result = arithmetic_expr.parseString(expr)

I have two problems here.

First, when I dump the result, I get the following:

[['9', '+', ['2', '*', '3']]]
- expr: ['9', '+', ['2', '*', '3']]
  - op: '+'
  - value: ['9']

The corresponding XML output is:

<result>
  <expr>
    <value>9</value>
    <op>+</op>
    <value>
      <value>2</value>
      <op>*</op>
      <value>3</value>
    </value>
  </expr>
</result>

What I want is for ['2', '*', '3'] to show up as expr, i.e.

<result>
  <expr>
    <value>9</value>
    <op>+</op>
    <expr>
      <value>2</value>
      <op>*</op>
      <value>3</value>
    </expr>
  </expr>
</result>

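However, I am not sure how to use setResultName() to achieve this.

Second, when I iterate over the results, I unfortunately get plain strings for the simple parts. So I use the XML "hack" as a workaround (I got the idea from here: `pyparsing`: iterating over `ParsedResults`). Is there a better way now?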

Best regards, Apo

I also have one more small question about how to process the parsed results. My first attempt was to use a loop, for example:

def recurse_arithmetic_expression(tokens):
    for t in tokens:
        if t.getResultName() == "value":
            pass # do something...
        elif t.getResultName() == "identifier":
            pass # do something else..
        elif t.getResultName() == "op":
            pass # do something completely different...
        elif isinstance(t, ParseResults):
            recurse_arithmetic_expression(t)

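Unfortunately, t can be a string or an int/float, so I get an exception when I try to call getResultName. And when I use asDict, the order of the tokens is lost.

Is it possible to get an ordered dict and iterate over its keys with something like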

for tag, token in tokens.iteritems():

where tag specifies the type of the token (e.g. op, value, identifier, expr...) and token is the corresponding token?

Answer 1

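If you want pyparsing to convert the numeric strings to ints, you can add a parse action to have that done at parse time. Or, use the predefined integer and float values defined in pyparsing_common (a namespace class imported with pyparsing):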

numeric_value = (pyparsing_common.number | bool_format)("value*")
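For the first option, a conversion parse action could look like the minimal sketch below. The name `integer` is hypothetical, and the expression only covers unsigned integers; it is meant to show the mechanism, not to replace the `integer_format` from the question.

```python
from pyparsing import Word, nums

# Hypothetical unsigned-integer expression; the parse action runs at parse
# time and replaces the matched string with an int.
integer = Word(nums).setParseAction(lambda tokens: int(tokens[0]))

result = integer.parseString("42")
print(result[0], type(result[0]))
```

With the conversion done at parse time, later code that walks the results receives ints instead of strings.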

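For your naming problem, you can add parse actions that get run at each level of infixNotation. In the code below, I have added a parse action that just adds the 'expr' name to the current parsed group. You will also want to add '*' to all of your ops, so that repeated operators get the same "keep all, not just the last" behavior for results names: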

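from pyparsing import (Literal, ParseResults, Regex, infixNotation, oneOf,
                       opAssoc, pyparsing_common)

bool_format = oneOf("true false")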
numeric_value = (pyparsing_common.number | bool_format)("value*")
identifier = Regex('[a-zA-Z_][a-zA-Z_0-9]*')("identifier*")

operand = numeric_value | identifier

expop = Literal('^')("op*")
signop = oneOf('+ -')("op*")
multop = oneOf('* /')("op*")
plusop = oneOf('+ -')("op*")
factop = Literal('!')("op*")


def add_name(s,l,t):
    t['expr'] = t[0]

arithmetic_expr = infixNotation(operand,
    [("!", 1, opAssoc.LEFT, add_name),
     ("^", 2, opAssoc.RIGHT, add_name),
     (signop, 1, opAssoc.RIGHT, add_name),
     (multop, 2, opAssoc.LEFT, add_name),
     (plusop, 2, opAssoc.LEFT, add_name),]
    )("expr")

See how these results look now:

arithmetic_expr.runTests("""
    9 + 2 * 3 * 7
""")

print(arithmetic_expr.parseString('9+2*3*7').asXML())

Gives:

9 + 2 * 3 * 7
[[9, '+', [2, '*', 3, '*', 7]]]
- expr: [9, '+', [2, '*', 3, '*', 7]]
  - expr: [2, '*', 3, '*', 7]
    - op: ['*', '*']
    - value: [2, 3, 7]
  - op: ['+']
  - value: [9]


<expr>
  <expr>
    <value>9</value>
    <op>+</op>
    <expr>
      <value>2</value>
      <op>*</op>
      <value>3</value>
      <op>*</op>
      <value>7</value>
    </expr>
  </expr>
</expr>
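The effect of the trailing '*' on a results name can also be seen in isolation. The sketch below uses two hypothetical expressions, `last_only` and `keep_all`, that differ only in that marker:

```python
from pyparsing import OneOrMore, Word, nums

# Same expression twice; only the trailing '*' on the results name differs.
last_only = OneOrMore(Word(nums)("value"))
keep_all = OneOrMore(Word(nums)("value*"))

r1 = last_only.parseString("1 2 3")
r2 = keep_all.parseString("1 2 3")

print(r1["value"])           # '3' - only the last match is kept
print(r2["value"].asList())  # ['1', '2', '3'] - every match is kept
```

Without the '*', each new match overwrites the value stored under the name, which is why the original dump showed only value: ['9'].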

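Note: I generally discourage people from using asXML, as it has to do a fair bit of guessing to create its output. You are probably better off navigating the parsed results manually. Also, look at some of the examples on the pyparsing wiki Examples page, especially SimpleBool.py, which uses classes for the per-level parse actions used in infixNotation.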
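As one way of navigating the results manually, the nested structure can be converted to plain lists with asList() and walked recursively. The sketch below assumes a reduced grammar with only the two binary levels, and `to_prefix` is a hypothetical helper that relies purely on the "value [op value]..." shape of each group:

```python
from pyparsing import infixNotation, oneOf, opAssoc, pyparsing_common

# Reduced grammar (an assumption for this sketch): binary * / and + - only.
expr = infixNotation(pyparsing_common.number, [
    (oneOf("* /"), 2, opAssoc.LEFT),
    (oneOf("+ -"), 2, opAssoc.LEFT),
])

def to_prefix(node):
    # Leaves are numbers; groups are lists shaped [value, op, value, ...].
    if not isinstance(node, list):
        return str(node)
    text = to_prefix(node[0])
    for op, operand in zip(node[1::2], node[2::2]):
        text = "({} {} {})".format(op, text, to_prefix(operand))
    return text

nested = expr.parseString("9 + 2 * 3", parseAll=True).asList()
print(nested)                # [[9, '+', [2, '*', 3]]]
print(to_prefix(nested[0]))  # (+ 9 (* 2 3))
```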

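EDIT:

At this point, I would really like to discourage you from continuing down the path of using results names to guide the evaluation of the parsed results. Please look at these two methods for recursing over the parsed tokens (note that the method you were looking for is getName, not getResultName):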

result = arithmetic_expr.parseString('9 + 2 * 4 * 6')

def iterate_over_parsed_expr(tokens):
    for t in tokens:
        if isinstance(t, ParseResults):
            tag = t.getName()
            print(t, 'is', tag)
            iterate_over_parsed_expr(t)
        else:
            print(t, 'is', type(t))

iterate_over_parsed_expr(result)

import operator
op_map = {
    '+' : operator.add,
    '-' : operator.sub,
    '*' : operator.mul,
    '/' : operator.truediv
    }
def eval_parsed_expr(tokens):
    t = tokens
    if isinstance(t, ParseResults):
        # evaluate initial value as left-operand
        cur_value = eval_parsed_expr(t[0])
        # iterate through remaining tokens, as operator-operand pairs
        for op, operand in zip(t[1::2], t[2::2]):
            # look up the correct binary function for operator
            op_func = op_map[op]
            # evaluate function, and update cur_value with result
            cur_value = op_func(cur_value, eval_parsed_expr(operand))

        # no more tokens, return the value
        return cur_value
    else:
        # token is just a scalar int or float, just return it
        return t

print(eval_parsed_expr(result))  # gives 57, which I think is the right answer

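eval_parsed_expr relies on the structure of the parsed tokens, rather than on results names. For this limited case, the operators are all binary, so for each nested structure, the resulting tokens are "value [op value]...", and the values themselves can be ints, floats, or nested ParseResults - but never strs, at least not for the 4 binary operators I have hard-coded in this method. Rather than trying to special-case your way through unary operations and right-associative operations, please look at how this is done in eval_arith.py (http://pyparsing.wikispaces.com/file/view/eval_arith.py/68273277/eval_arith.py), by associating evaluator classes with each operand type and with each level of infixNotation.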
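A condensed sketch of that evaluator-class idea follows. It is not the code from eval_arith.py: the class names (SignOp, PowerOp, BinaryOp) and the helpers value_of and evaluate are hypothetical, factorial and identifiers are left out, and operands stay plain numbers instead of getting their own class.

```python
import operator
from pyparsing import infixNotation, oneOf, opAssoc, pyparsing_common

def value_of(node):
    # Operands are plain ints/floats; nested groups are evaluator objects.
    return node.eval() if hasattr(node, "eval") else node

class SignOp:
    """Unary + or -, parsed as a group like ['-', operand]."""
    def __init__(self, tokens):
        self.sign, self.operand = tokens[0]
    def eval(self):
        v = value_of(self.operand)
        return -v if self.sign == "-" else v

class PowerOp:
    """Right-associative '^': fold the operands from right to left."""
    def __init__(self, tokens):
        self.operands = tokens[0][0::2]
    def eval(self):
        result = value_of(self.operands[-1])
        for base in reversed(self.operands[:-1]):
            result = value_of(base) ** result
        return result

class BinaryOp:
    """Left-associative binary ops, parsed as [value, op, value, ...]."""
    ops = {"+": operator.add, "-": operator.sub,
           "*": operator.mul, "/": operator.truediv}
    def __init__(self, tokens):
        self.tokens = tokens[0]
    def eval(self):
        result = value_of(self.tokens[0])
        for op, operand in zip(self.tokens[1::2], self.tokens[2::2]):
            result = self.ops[op](result, value_of(operand))
        return result

# Each precedence level gets its own evaluator class as its parse action.
arith = infixNotation(pyparsing_common.number, [
    ("^", 2, opAssoc.RIGHT, PowerOp),
    (oneOf("+ -"), 1, opAssoc.RIGHT, SignOp),
    (oneOf("* /"), 2, opAssoc.LEFT, BinaryOp),
    (oneOf("+ -"), 2, opAssoc.LEFT, BinaryOp),
])

def evaluate(text):
    return value_of(arith.parseString(text, parseAll=True)[0])

print(evaluate("9 + 2 * 4 * 6"))  # 57
print(evaluate("2 ^ 3 ^ 2"))      # 512
print(evaluate("-(3 + 4)"))       # -7
```

Because every level builds its own object at parse time, evaluation needs neither results names nor type checks on strings; each class only has to know the token shape of its own level.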
