ParseResults output structure with repeating named tokens: how to keep the order in the named dictionary

Consider the following code, which I wrote to reproduce my problem (a follow-up to my previous question):

from pyparsing import *

line = 'a(1)->b(2)->c(3)->b(4)->a(5)'

LPAR, RPAR = map(Suppress, "()")
num = Word(nums)
SEQOP = Suppress('->')

a = Group(Literal('a')+LPAR+num+RPAR)('ela*')
b = Group(Literal('b')+LPAR+num+RPAR)('elb*')
c = Group(Literal('c')+LPAR+num+RPAR)('elc*')

element = a | b | c

one_seq_expr = Group(element + (SEQOP + element)[...])('one_seq_expr')

out = one_seq_expr.parseString(line)

print(out.dump())

This code gives me the following output:

[[['a', '1'], ['b', '2'], ['c', '3'], ['b', '4'], ['a', '5']]]
- one_seq_expr: [['a', '1'], ['b', '2'], ['c', '3'], ['b', '4'], ['a', '5']]
  - ela: [['a', '1'], ['a', '5']]
    [0]:
      ['a', '1']
    [1]:
      ['a', '5']
  - elb: [['b', '2'], ['b', '4']]
    [0]:
      ['b', '2']
    [1]:
      ['b', '4']
  - elc: [['c', '3']]
    [0]:
      ['c', '3']

We can access the results in several ways:

>> out[0]
([(['a', '1'], {}), (['b', '2'], {}), (['c', '3'], {}), (['b', '4'], {}), (['a', '5'], {})], {'ela': [(['a', '1'], {}), (['a', '5'], {})], 'elb': [(['b', '2'], {}), (['b', '4'], {})], 'elc': [(['c', '3'], {})]})
>> out['one_seq_expr']
([(['a', '1'], {}), (['b', '2'], {}), (['c', '3'], {}), (['b', '4'], {}), (['a', '5'], {})], {'ela': [(['a', '1'], {}), (['a', '5'], {})], 'elb': [(['b', '2'], {}), (['b', '4'], {})], 'elc': [(['c', '3'], {})]})
>> out['one_seq_expr'][0:4]
[(['a', '1'], {}), (['b', '2'], {}), (['c', '3'], {}), (['b', '4'], {})]
>> for _ in out[0]: print(_)
['a', '1']
['b', '2']
['c', '3']
['b', '4']
['a', '5']
>> out['one_seq_expr']['ela']
([(['a', '1'], {}), (['a', '5'], {})], {})

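The ParseResults object out['one_seq_expr'] keeps the different tokens in the order in which they were found. The named results, on the other hand, group the tokens by name, so order is preserved only among the occurrences of each name.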

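Is it possible to get an output structure that preserves the order across the different elements while still keeping the names in some form? Something like: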

- one_seq_expr: [['a', '1'], ['b', '2'], ['c', '3'], ['b', '4'], ['a', '5']]
  - ela_0: [['a', '1']]
    [0]:
      ['a', '1']
  - elb_0: [['b', '2']]
    [0]:
      ['b', '2']
  - elc_0: [['c', '3']]
    [0]:
      ['c', '3']
  - elb_1: [['b', '4']]
    [0]:
      ['b', '4']
  - ela_1: [['a', '5']]
    [0]:
      ['a', '5']

Or do we have to call ParseResults.getName() on each item of the ordered token list out['one_seq_expr']? For example:

>> [_.getName() for _ in out['one_seq_expr']]
['ela', 'elb', 'elc', 'elb', 'ela']
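
If the getName() route is taken, numbered keys like those in the desired output can be derived from that list of names. The following is only a minimal sketch in plain Python, not a pyparsing feature: the helper number_names is a hypothetical name, and the names and tokens lists are copied by hand from the output above rather than produced by a parser.

```python
from collections import defaultdict


def number_names(names):
    """Append a per-name running index to each name, keeping input order."""
    counters = defaultdict(int)
    numbered = []
    for name in names:
        numbered.append(f"{name}_{counters[name]}")
        counters[name] += 1
    return numbered


# Values copied from the session above (stand-ins for real parser output).
names = ['ela', 'elb', 'elc', 'elb', 'ela']
tokens = [['a', '1'], ['b', '2'], ['c', '3'], ['b', '4'], ['a', '5']]

# dicts keep insertion order in Python 3.7+, so parse order is preserved.
ordered = dict(zip(number_names(names), tokens))
for key, value in ordered.items():
    print(key, value)
```

Each key pairs a name with its occurrence count, so the keys stay unique while iteration over the dict still follows the order of the input line.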

You can use parse actions to annotate the elements with their respective types; the annotation is then carried inside each element:

a.addParseAction(lambda t: t[0].insert(0, "ELA_TYPE"))
b.addParseAction(lambda t: t[0].insert(0, "ELB_TYPE"))
c.addParseAction(lambda t: t[0].insert(0, "ELC_TYPE"))

Parsing with these expressions and dumping the results gives (manually reformatted):

- one_seq_expr: [['ELA_TYPE', 'a', '1'], 
                 ['ELB_TYPE', 'b', '2'], 
                 ['ELC_TYPE', 'c', '3'], 
                 ['ELB_TYPE', 'b', '4'], 
                 ['ELA_TYPE', 'a', '5']]
   ... etc. ...
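
Because the type tag is now the first item of every group, a single pass over the ordered list can dispatch on it. Here is a minimal sketch, assuming the annotated groups have been converted to plain lists (for instance with ParseResults.asList()); the data is copied from the dump above, and the handlers table and its messages are hypothetical.

```python
# Annotated groups copied from the dump above; in a real run they would come
# from something like out['one_seq_expr'].asList() (assumption).
annotated = [
    ['ELA_TYPE', 'a', '1'],
    ['ELB_TYPE', 'b', '2'],
    ['ELC_TYPE', 'c', '3'],
    ['ELB_TYPE', 'b', '4'],
    ['ELA_TYPE', 'a', '5'],
]

# Hypothetical per-type handlers; each receives the tokens that follow the tag.
handlers = {
    'ELA_TYPE': lambda letter, num: f"A-element with value {num}",
    'ELB_TYPE': lambda letter, num: f"B-element with value {num}",
    'ELC_TYPE': lambda letter, num: f"C-element with value {num}",
}

# Walk the groups in parse order and dispatch on the leading type tag.
described = [handlers[tag](*rest) for tag, *rest in annotated]
for line in described:
    print(line)
```

The order of the results follows the order of the input, and the type of each element is available without consulting the results names at all.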