pyparsing generic python function args and kwargs

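I am trying to create a parser for generic Python function calls that separates out the args and kwargs. I looked through the examples but could not find one that helps.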

Here is an example of what I want to parse, and the output I would like after parsing with parseString().asDict().

example = "test(1, 2, 3, hello, a=4, stuff=there, d=5)"
results = xxx.parseString(example).asDict()
results
{'name': 'test', 'args': ['1', '2', '3', 'hello'], 'kwargs': {'a': '4', 'stuff': 'there', 'd': '5'}}

or 
example = "test(a=4, stuff=there, d=5)"
results = xxx.parseString(example).asDict()
results
{'name': 'test', 'args': '', 'kwargs': {'a': '4', 'stuff': 'there', 'd': '5'}}

or
example = "test(1, 2, 3, hello)"
results = xxx.parseString(example).asDict()
results
{'name': 'test', 'args': ['1', '2', '3', 'hello'], 'kwargs': ''}

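Both args and kwargs should be optional. For now I am ignoring the fully general *args and **kwargs forms, nested lists as inputs, and so on. I managed to get it working when there are only args or only kwargs, but it fails when I have both.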

import pyparsing as pp

LPAR = pp.Suppress('(')
RPAR = pp.Suppress(')')

# define generic number
number = pp.Regex(r"[+\-~]?\d+(?:\.\d*)?(?:[eE][+-]?\d+)?")

# define function arguments
arglist = pp.delimitedList(number | (pp.Word(pp.alphanums + '-_') + pp.NotAny('=')) )
args = pp.Group(arglist).setResultsName('args')

# define function keyword arguments
key = pp.Word(pp.alphas) + pp.Suppress('=')
values = (number | pp.Word(pp.alphas))
keyval = pp.dictOf(key, values)
kwarglist = pp.delimitedList(keyval)
kwargs = pp.Group(kwarglist).setResultsName('kwargs')

# build generic function
fxn_args = pp.Optional(args, default='') + pp.Optional(kwargs, default='')
fxn_name = (pp.Word(pp.alphas)).setResultsName('name')
fxn = pp.Group(fxn_name + LPAR + fxn_args + RPAR)
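
A side note on hand-written number patterns like the one above: inside a character class, `+-~` is read as a range from `+` to `~` (which covers digits and letters), and `(:?...)` is a capturing group with an optional colon, not the non-capturing `(?:...)`. The sketch below uses only the standard `re` module to show the difference; the names `loose` and `strict` are illustrative and not part of the original post.

```python
import re

# Pattern with the two pitfalls: a "+-~" range and "(:?" groups.
loose = re.compile(r"[+-~]?\d+(:?\.\d*)?(:?[eE][+-]?\d+)?")
# Escaped hyphen and real non-capturing groups.
strict = re.compile(r"[+\-~]?\d+(?:\.\d*)?(?:[eE][+-]?\d+)?")

# Compare how each pattern treats a valid number and two malformed ones.
for text in ["-1.5e3", "a1", "1:.5"]:
    print(text, bool(loose.fullmatch(text)), bool(strict.fullmatch(text)))
```

Both patterns accept "-1.5e3", but only the loose one accepts "a1" and "1:.5".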

Results

# parsing only kwargs
fxn.parseString('test(a=4, stuff=there, d=5)')[0].asDict()
{'name': 'test', 'args': '', 'kwargs': {'a': '4', 'stuff': 'there', 'd': '5'}}

# parsing only args
fxn.parseString('test(1, 2, 3, hello)')[0].asDict()
{'name': 'test', 'args': ['1', '2', '3', 'hello'], 'kwargs': ''}

# parsing both
fxn.parseString('test(1, 2, 3, hello, a=4, stuff=there, d=5)')[0].asDict()
...
ParseException: Expected ")", found ','  (at char 19), (line:1, col:20)

If I check the parsing of fxn_args alone, the kwargs are lost entirely:

# parse only kwargs
fxn_args.parseString('c=4, stuff=there, d=5.234').asDict()
{'args': '', 'kwargs': {'c': '4', 'stuff': 'there', 'd': '5.234'}}

# parse both args and kwargs
fxn_args.parseString('1, 2, 3, hello, c=4, stuff=there, d=5.234').asDict()
{'args': ['1', '2', '3', 'hello'], 'kwargs': ''}

When both args and kwargs are present, your parser trips over the ',' between them.
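
One way to see where the parse stops is to run the args expression on its own and look at what it leaves behind. The sketch below rebuilds the args and kwargs expressions from the question (with the kwargs definition condensed onto one line) and uses scanString to get the end position of the match; the variable names `matched_args`, `rest`, and `kwargs_ok` are illustrative.

```python
import pyparsing as pp

number = pp.Regex(r"[+\-~]?\d+(?:\.\d*)?(?:[eE][+-]?\d+)?")

# positional arguments: numbers or words not followed by '='
arglist = pp.delimitedList(number | (pp.Word(pp.alphanums + '-_') + pp.NotAny('=')))
args = pp.Group(arglist).setResultsName('args')

# keyword arguments: key=value pairs
key = pp.Word(pp.alphas) + pp.Suppress('=')
values = number | pp.Word(pp.alphas)
kwargs = pp.Group(pp.delimitedList(pp.dictOf(key, values))).setResultsName('kwargs')

text = '1, 2, 3, hello, a=4'

# scanString yields (tokens, start, end) for each match; take the first one
tokens, start, end = next(args.scanString(text))
matched_args = tokens[0].asList()
rest = text[end:]  # whatever args did not consume

# kwargs cannot start on the leftover text because it begins with ','
try:
    kwargs.parseString(rest)
    kwargs_ok = True
except pp.ParseException:
    kwargs_ok = False

print(matched_args, repr(rest), kwargs_ok)
```

The args list stops right after 'hello', so the remaining text starts with the comma, and nothing in Optional(args) + Optional(kwargs) consumes it.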

You can see this for yourself using pyparsing's runTests method:

fxn.runTests("""\
    # parsing only kwargs
    test(a=4, stuff=there, d=5)

    # parsing only args
    test(1, 2, 3, hello)

    # parsing both
    test(1, 2, 3, hello, a=4, stuff=there, d=5)
""")

This prints:

# parsing only kwargs
test(a=4, stuff=there, d=5)
[['test', '', [['a', 4], ['stuff', 'there'], ['d', 5]]]]
[0]:
  ['test', '', [['a', 4], ['stuff', 'there'], ['d', 5]]]
  - args: ''
  - kwargs: [['a', 4], ['stuff', 'there'], ['d', 5]]
    - a: 4
    - d: 5
    - stuff: 'there'
  - name: 'test'

# parsing only args
test(1, 2, 3, hello)
[['test', [1, 2, 3, 'hello'], '']]
[0]:
  ['test', [1, 2, 3, 'hello'], '']
  - args: [1, 2, 3, 'hello']
  - kwargs: ''
  - name: 'test'

# parsing both
test(1, 2, 3, hello, a=4, stuff=there, d=5)
                   ^
FAIL: Expected ")", found ','  (at char 19), (line:1, col:20)

The easiest fix:

fxn_args =  args + ',' + kwargs | pp.Optional(args, default='') + pp.Optional(kwargs, default='')
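
Put together with the definitions from the question, the fix might look like the sketch below. It differs from the one-liner above in one small way, which is an assumption of this sketch rather than part of the original answer: the separating comma is wrapped in pp.Suppress (named `COMMA` here) so that it does not show up in the parsed tokens.

```python
import pyparsing as pp

LPAR = pp.Suppress('(')
RPAR = pp.Suppress(')')
COMMA = pp.Suppress(',')  # illustrative name; keeps the ',' out of the results

number = pp.Regex(r"[+\-~]?\d+(?:\.\d*)?(?:[eE][+-]?\d+)?")

# positional arguments: numbers or words not followed by '='
arglist = pp.delimitedList(number | (pp.Word(pp.alphanums + '-_') + pp.NotAny('=')))
args = pp.Group(arglist).setResultsName('args')

# keyword arguments: key=value pairs collected into a dict-like group
key = pp.Word(pp.alphas) + pp.Suppress('=')
values = number | pp.Word(pp.alphas)
kwargs = pp.Group(pp.delimitedList(pp.dictOf(key, values))).setResultsName('kwargs')

# try "args , kwargs" first, then fall back to either one (or neither)
fxn_args = (args + COMMA + kwargs
            | pp.Optional(args, default='') + pp.Optional(kwargs, default=''))
fxn_name = pp.Word(pp.alphas).setResultsName('name')
fxn = pp.Group(fxn_name + LPAR + fxn_args + RPAR)

both = fxn.parseString('test(1, 2, 3, hello, a=4, stuff=there, d=5)')[0].asDict()
only_args = fxn.parseString('test(1, 2, 3, hello)')[0].asDict()
only_kwargs = fxn.parseString('test(a=4, stuff=there, d=5)')[0].asDict()
print(both)
print(only_args)
print(only_kwargs)
```

Because `+` binds more tightly than `|`, the expression is read as (args + COMMA + kwargs) | (Optional(args) + Optional(kwargs)), so the two-part form is attempted first and the original form remains as the fallback.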

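You may also find that identifiers are more than just Word(alphas); they can also contain '_' and digits. The pyparsing_common namespace class included with pyparsing has an identifier expression: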

ppc = pp.pyparsing_common
ident = ppc.identifier()
number = ppc.number()

number will also convert its matches to int or float automatically.
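
A short sketch of how these two expressions behave on their own and inside a key=value pair. The `keyval` expression and the sample strings are illustrative and not taken from the original answer.

```python
import pyparsing as pp

ppc = pp.pyparsing_common
ident = ppc.identifier()
number = ppc.number()

# identifiers may contain underscores and digits (after the first character)
name = ident.parseString('stuff_2')[0]

# number returns real Python numbers rather than strings
int_value = number.parseString('5')[0]
float_value = number.parseString('5.234')[0]

# a single key=value pair built from the two expressions
keyval = pp.Group(ident + pp.Suppress('=') + (number | ident))
pair = keyval.parseString('d_1=5.234')[0].asList()

print(name, int_value, float_value, pair)
```

With converted values, the kwargs come back as 4 and 5 rather than '4' and '5', which matches the runTests output shown earlier.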