Parse parameter list of a Visual Basic function with Python and pyparsing
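I am trying to parse Visual Basic (VBA) function declarations with pyparsing in order to convert them to Python syntax.
The usual VBA function header is not the main problem and works fine for me. But I am having trouble with the parameter list: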
Public Function MyFuncName(first As Integer, Second As String) As Integer
The parameter list consists of zero or more comma-separated items, for example:
VarName
VarName As VarType
Optional VarName As VarType = InitValue
ByVal VarName As VarType
where "Optional", "ByVal", "ByRef", and the type declaration are all entirely optional.
My idea was to extract the complete parameter list from the original line with
allparams = Regex('[^)]*').setResultsName('params')
and then parse it separately. This matches a single parameter:
variablename = Word(alphas + '_', alphanums + '_')
typename = variablename.setResultsName('type')
default_value = Word(alphanums)
optional_term = oneOf('Optional', True)
byval_term = oneOf('ByRef ByVal', True)
paramsparser = Optional(optional_term) \
+Optional(byval_term) \
+variablename.setResultsName('pname', True) \
+Optional('As' + typename) \
+Optional('=' + default_value)
But even with delimitedList(paramsparser) I only get the first one.
AssertionError: 'def test(one):\n\tpass' != 'def test(one, two):\n\tpass'
- def test(one):
+ def test(one, two):
? +++++
Do you have any ideas?
I used the code you posted almost verbatim, wrapped it in delimitedList, and got both parameters:
paramsparser = Optional(optional_term) \
+Optional(byval_term) \
+variablename.setResultsName('pname', True) \
+Optional('As' + typename) \
+Optional('=' + default_value)
parser = "(" + delimitedList(paramsparser) + ")"
parser.runTests("""\
(one, two)
(ByRef one As Int = 1, Optional ByVal two As Char)
""")
Prints:
(one, two)
['(', 'one', 'two', ')']
- pname: ['one', 'two']
(ByRef one As Int = 1, Optional ByVal two As Char)
['(', 'ByRef', 'one', 'As', 'Int', '=', '1', 'Optional', 'ByVal', 'two', 'As', 'Char', ')']
- pname: ['one', 'two']
- type: 'Char'
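Note that the second result above reports only one `type` value ('Char') even though both parameters declare a type. The following is a minimal sketch of why that happens, using a simplified parameter expression that is an assumption for illustration (it is not the asker's full grammar): results names defined outside a Group all land in one flat namespace, so a later match replaces an earlier one.

```python
from pyparsing import Word, alphas, alphanums, Optional, Group, delimitedList

identifier = Word(alphas + "_", alphanums + "_")

# Simplified parameter: a name plus an optional "As <type>" clause.
flat_param = identifier("pname") + Optional("As" + identifier("ptype"))

text = "one As Int, two As Char"

# Ungrouped: every parameter writes into the same flat set of names,
# so only the last match for each name is kept.
flat = delimitedList(flat_param).parseString(text, parseAll=True)
print(flat["ptype"])

# Grouped: each parameter gets its own sub-result with its own names.
grouped = delimitedList(Group(flat_param)).parseString(text, parseAll=True)
print([(p["pname"], p["ptype"]) for p in grouped])
```

In the ungrouped case the first parameter's type is no longer reachable by name, which is the kind of collision that grouping avoids.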
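But since each parameter has many fields, I suggest giving each field its own results name and wrapping each parameter in a Group, so that the parameters do not stomp on each other. Here is my modification of your parser (it was very helpful that you posted the various forms of the optional declaration fields):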
from pyparsing import (Word, alphas, alphanums, quotedString, Keyword, Group,
                       Optional, oneOf, delimitedList, Suppress,
                       pyparsing_common as ppc)
LPAR, RPAR, EQ = map(Suppress, "()=")
OPTIONAL, BYREF, BYVAL, AS, FUNCTION = map(Keyword, "Optional ByRef ByVal As Function".split())
# think abstract for expression names, like 'identifier' not 'variablename'; then
# you can use identifier for the variable name, the function name, as a possible
# var type, etc.
identifier = Word(alphas + "_", alphanums + "_")
rvalue = ppc.number() | quotedString() | identifier()
type_expr = identifier()
# add results names when assembling in groups
param_expr = Group(
    Optional(OPTIONAL("optional"))
    + Optional(BYREF("byref") | BYVAL("byval"))
    + identifier("pname")
    + Optional(AS + type_expr("ptype"))
    + Optional(EQ + rvalue("default"))
)
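Then, rather than grabbing the parameters with a regular expression and re-parsing them in a separate step, I would include them in the overall function expression definition: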
protection = oneOf("Public Private", asKeyword=True)
func_expr = (
    protection("protection")
    + FUNCTION
    + identifier("fname")
    + Group(LPAR + delimitedList(param_expr) + RPAR)("parameters")
    + Optional(AS + type_expr("return_type"))
)
tests = """
Public Function MyFuncName(first As Integer, Second As String) As Integer
"""
func_expr.runTests(tests)
Prints:
Public Function MyFuncName(first As Integer, Second As String) As Integer
['Public', 'Function', 'MyFuncName', [['first', 'As', 'Integer'], ['Second', 'As', 'String']], 'As', 'Integer']
- fname: 'MyFuncName'
- parameters: [['first', 'As', 'Integer'], ['Second', 'As', 'String']]
  [0]:
    ['first', 'As', 'Integer']
    - pname: 'first'
    - ptype: 'Integer'
  [1]:
    ['Second', 'As', 'String']
    - pname: 'Second'
    - ptype: 'String'
- protection: 'Public'
- return_type: 'Integer'
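To tie this back to the goal in the question (emitting Python syntax), here is a rough sketch of how the named results might be turned into a `def` stub. The helper `to_python_def`, the choice to drop type information, and the mapping of an `Optional` parameter without a default to `=None` are all assumptions made for illustration, not part of the parser above. The sketch also wraps `delimitedList` in `Optional(...)`, because `delimitedList` on its own needs at least one item while the question allows zero parameters.

```python
from pyparsing import (Word, alphas, alphanums, quotedString, Keyword, Group,
                       Optional, oneOf, delimitedList, Suppress,
                       pyparsing_common as ppc)

LPAR, RPAR, EQ = map(Suppress, "()=")
OPTIONAL, BYREF, BYVAL, AS, FUNCTION = map(
    Keyword, "Optional ByRef ByVal As Function".split())

identifier = Word(alphas + "_", alphanums + "_")
rvalue = ppc.number() | quotedString() | identifier()

param_expr = Group(
    Optional(OPTIONAL("optional"))
    + Optional(BYREF("byref") | BYVAL("byval"))
    + identifier("pname")
    + Optional(AS + identifier("ptype"))
    + Optional(EQ + rvalue("default"))
)

func_expr = (
    oneOf("Public Private", asKeyword=True)("protection")
    + FUNCTION
    + identifier("fname")
    # Optional(...) lets an empty "()" match as well.
    + Group(LPAR + Optional(delimitedList(param_expr)) + RPAR)("parameters")
    + Optional(AS + identifier("return_type"))
)


def to_python_def(line):
    """Hypothetical helper: render a VBA function header as a Python stub."""
    parsed = func_expr.parseString(line, parseAll=True)
    args = []
    # .get() with a default keeps this safe if no parameters were parsed.
    for p in parsed.get("parameters", []):
        arg = p["pname"]
        if "default" in p:
            # Assumption: the VBA literal is reused as-is in Python.
            arg += "=" + str(p["default"])
        elif "optional" in p:
            # Assumption: Optional without a default becomes None.
            arg += "=None"
        args.append(arg)
    return "def {}({}):\n\tpass".format(parsed["fname"], ", ".join(args))


print(to_python_def("Public Function test(one, two)"))
print(to_python_def("Private Function Now()"))
print(to_python_def(
    "Public Function f(ByRef one As Integer = 1, Optional ByVal two As String) As Integer"))
```

Because every parameter lives in its own group, the loop can read `pname`, `default`, and `optional` per parameter without one parameter's fields leaking into another's.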