ParseResults output structure with repeating named tokens: how to keep the order in the named dictionary
Consider the following code, which I wrote to illustrate my problem (a follow-up to my previous question):
from pyparsing import *
line = 'a(1)->b(2)->c(3)->b(4)->a(5)'
LPAR, RPAR = map(Suppress, "()")
num = Word(nums)
SEQOP = Suppress('->')
a = Group(Literal('a')+LPAR+num+RPAR)('ela*')
b = Group(Literal('b')+LPAR+num+RPAR)('elb*')
c = Group(Literal('c')+LPAR+num+RPAR)('elc*')
element = a | b | c
one_seq_expr = Group(element + (SEQOP + element)[...])('one_seq_expr')
out = one_seq_expr.parseString(line)
print(out.dump())
This code gives me the following result:
[[['a', '1'], ['b', '2'], ['c', '3'], ['b', '4'], ['a', '5']]]
- one_seq_expr: [['a', '1'], ['b', '2'], ['c', '3'], ['b', '4'], ['a', '5']]
  - ela: [['a', '1'], ['a', '5']]
    [0]:
      ['a', '1']
    [1]:
      ['a', '5']
  - elb: [['b', '2'], ['b', '4']]
    [0]:
      ['b', '2']
    [1]:
      ['b', '4']
  - elc: [['c', '3']]
    [0]:
      ['c', '3']
We can access the results in different ways:
>> out[0]
([(['a', '1'], {}), (['b', '2'], {}), (['c', '3'], {}), (['b', '4'], {}), (['a', '5'], {})], {'ela': [(['a', '1'], {}), (['a', '5'], {})], 'elb': [(['b', '2'], {}), (['b', '4'], {})], 'elc': [(['c', '3'], {})]})
>> out['one_seq_expr']
([(['a', '1'], {}), (['b', '2'], {}), (['c', '3'], {}), (['b', '4'], {}), (['a', '5'], {})], {'ela': [(['a', '1'], {}), (['a', '5'], {})], 'elb': [(['b', '2'], {}), (['b', '4'], {})], 'elc': [(['c', '3'], {})]})
>> out['one_seq_expr'][0:4]
[(['a', '1'], {}), (['b', '2'], {}), (['c', '3'], {}), (['b', '4'], {})]
>> for _ in out[0]: print(_)
['a', '1']
['b', '2']
['c', '3']
['b', '4']
['a', '5']
>> out['one_seq_expr']['ela']
([(['a', '1'], {}), (['a', '5'], {})], {})
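The ParseResults object out['one_seq_expr'] keeps the different tokens in the order in which they were found. The named results, on the other hand, are grouped by name, and only the order of occurrence within each name is kept.
Is it possible to get an output structure that keeps the order across the different elements while still retaining the names in some form? Something like: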
- one_seq_expr: [['a', '1'], ['b', '2'], ['c', '3'], ['b', '4'], ['a', '5']]
  - ela_0: [['a', '1']]
    [0]:
      ['a', '1']
  - elb_0: [['b', '2']]
    [0]:
      ['b', '2']
  - elc_0: [['c', '3']]
    [0]:
      ['c', '3']
  - elb_1: [['b', '4']]
    [0]:
      ['b', '4']
  - ela_1: [['a', '5']]
    [0]:
      ['a', '5']
Or do we have to call ParseResults.getName() on each item of the ordered token list out['one_seq_expr']? For example:
>> [_.getName() for _ in out['one_seq_expr']]
['ela', 'elb', 'elc', 'elb', 'ela']
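If the getName() route is taken, a numbered-key structure like the one wished for above can be rebuilt afterwards in plain Python. The following is only a minimal sketch and not something pyparsing provides: number_by_name is a hypothetical helper, and the names and tokens lists are hard-coded copies of the outputs shown above rather than live parse results.

```python
from collections import Counter

def number_by_name(names, tokens):
    """Pair each token with '<name>_<n>', where n counts earlier uses of that name."""
    seen = Counter()
    numbered = []
    for name, tok in zip(names, tokens):
        numbered.append((f"{name}_{seen[name]}", tok))
        seen[name] += 1
    return numbered

# Hard-coded from the outputs above (assumption: in real use these would come
# from [_.getName() for _ in out['one_seq_expr']] and the parsed groups).
names = ['ela', 'elb', 'elc', 'elb', 'ela']
tokens = [['a', '1'], ['b', '2'], ['c', '3'], ['b', '4'], ['a', '5']]

for key, tok in number_by_name(names, tokens):
    print(key, tok)
```

A list of pairs is used instead of a dict so that the parse order is explicit and does not depend on how the container is later iterated.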
You can use parse actions to annotate these elements with their respective types; the annotation is then retained inside each element:
a.addParseAction(lambda t: t[0].insert(0, "ELA_TYPE"))
b.addParseAction(lambda t: t[0].insert(0, "ELB_TYPE"))
c.addParseAction(lambda t: t[0].insert(0, "ELC_TYPE"))
Parsing with these expressions and dumping the results gives (reformatted by hand):
- one_seq_expr: [['ELA_TYPE', 'a', '1'],
                 ['ELB_TYPE', 'b', '2'],
                 ['ELC_TYPE', 'c', '3'],
                 ['ELB_TYPE', 'b', '4'],
                 ['ELA_TYPE', 'a', '5']]
... etc. ...
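With the type tag stored as the first item of each group, the sequence can be walked once, in order, and each element handled according to its tag. Here is a minimal sketch of that, assuming the annotated list from the dump above (hard-coded here, not produced by a live parse); split_tag is a hypothetical helper name.

```python
def split_tag(group):
    """Split an annotated group into (type_tag, remaining_tokens)."""
    tag, *rest = group
    return tag, rest

# Hard-coded copy of the annotated one_seq_expr shown in the dump above.
annotated = [
    ['ELA_TYPE', 'a', '1'],
    ['ELB_TYPE', 'b', '2'],
    ['ELC_TYPE', 'c', '3'],
    ['ELB_TYPE', 'b', '4'],
    ['ELA_TYPE', 'a', '5'],
]

# One pass in parse order; the tag travels with its element.
for tag, rest in map(split_tag, annotated):
    print(tag, rest)
```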