Convert BNF grammar to pyparsing

Question

How can I use regular expressions (or would pyparsing be better?) to describe the grammar of the following scripting language (Backus–Naur form):

<root>   :=     <tree> | <leaves>
<tree>   :=     <group> [* <group>] 
<group>  :=     "{" <leaves> "}" | <leaf>;
<leaves> :=     {<leaf>;} leaf
<leaf>   :=     <name> = <expression>{;}

<name>          := <string_without_spaces_and_tabs>
<expression>    := <string_without_spaces_and_tabs>

Example script:

{
 stage = 3;
 some.param1 = [10, 20];
} *
{
 stage = 4;
 param3 = [100,150,200,250,300]
} *
 endparam = [0, 1]

I am using Python's re.compile and want to split everything into groups, like this:

[ [ 'stage',       '3'],
  [ 'some.param1', '[10, 20]'] ],

[ ['stage',  '4'],
  ['param3', '[100,150,200,250,300]'] ],

[ ['endparam', '[0, 1]'] ]

Update: I found that pyparsing is a much better solution than regular expressions.

Answer 1

Pyparsing lets you simplify some of these kinds of constructs. For example,

leaves :: {leaf} leaf

becomes just

OneOrMore(leaf)
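As a quick check of that equivalence, here is a minimal sketch. The simplified leaf (alphanumeric name and value, mandatory semicolon) is an assumption made only for this illustration and is not the full grammar from the question.

```python
from pyparsing import Group, OneOrMore, Suppress, Word, alphanums

# Hypothetical, simplified leaf: "name = value;" with alphanumeric tokens only.
EQ, SEMI = map(Suppress, "=;")
leaf = Group(Word(alphanums) + EQ + Word(alphanums) + SEMI)

# "{leaf} leaf" (zero or more leaves, then one more leaf) is simply one or more leaves.
leaves = OneOrMore(leaf)

result = leaves.parseString("a = 1; b = 2; c = 3;").asList()
print(result)
```

Each leaf comes back as its own [name, value] pair because of the Group wrapper.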

So one form of your BNF in pyparsing looks like:

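from pyparsing import (Group, OneOrMore, Suppress, Word, ZeroOrMore,
                       printables, quotedString)

# Punctuation is matched but dropped from the parsed results.
LBRACE, RBRACE, EQ, SEMI = map(Suppress, "{}=;")
name = Word(printables, excludeChars="{}=;")
# quotedString is tried first; otherwise Word would consume the opening quote.
expr = quotedString | Word(printables, excludeChars="{}=;")

leaf = Group(name + EQ + expr + SEMI)
group = Group(LBRACE + ZeroOrMore(leaf) + RBRACE) | leaf
tree = OneOrMore(group)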

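I added quotedString as an alternative expr, in case you do want a value that contains one of the excluded characters. Wrapping leaf and group in Group preserves the brace structure in the results.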
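To illustrate the quotedString alternative, here is a small sketch that parses a value containing several of the excluded characters. The input line is made up for the example. One thing to watch: Word(printables, ...) also matches a leading quote character, so in this sketch quotedString is listed first in the alternation.

```python
from pyparsing import Group, Suppress, Word, printables, quotedString

EQ, SEMI = map(Suppress, "=;")
name = Word(printables, excludeChars="{}=;")
# quotedString goes first so a leading quote is not consumed by Word.
expr = quotedString | Word(printables, excludeChars="{}=;")
leaf = Group(name + EQ + expr + SEMI)

# Hypothetical input: the quoted value contains ';', '=', '{', '}' and spaces.
quoted = leaf.parseString('label = "a;b = {c}";').asList()
print(quoted)
```

By default quotedString keeps the surrounding quotes in the returned token.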

Unfortunately, your sample does not quite conform to this BNF:

The spaces in [10, 20] and [0, 1] make them invalid exprs
Some leaves have no terminating ;
The lone * characters (their purpose is unclear)
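If you would rather accept the sample as written, the grammar can be relaxed. The following is only a sketch under three assumptions that the question does not confirm: a bracketed list may contain spaces and is kept as raw text, the trailing semicolon is optional, and * merely separates groups. A bare leaf is also wrapped in its own Group so the result has the shape asked for in the question.

```python
from pyparsing import Group, Optional, Regex, Suppress, Word, ZeroOrMore, printables

LBRACE, RBRACE, EQ, SEMI, STAR = map(Suppress, "{}=;*")
name = Word(printables, excludeChars="{}=;*")
# Assumption: a bracketed list may contain spaces and is returned as raw text.
bracketed = Regex(r"\[[^\]]*\]")
expr = bracketed | Word(printables, excludeChars="{}=;*")

# Assumption: the terminating semicolon is optional.
leaf = Group(name + EQ + expr + Optional(SEMI))
# A bare leaf gets its own Group so every top-level entry is a list of leaves.
group = Group(LBRACE + ZeroOrMore(leaf) + RBRACE) | Group(leaf)
# Assumption: "*" only separates groups and carries no other meaning.
tree = group + ZeroOrMore(STAR + group)

sample = """
{
 stage = 3;
 some.param1 = [10, 20];
} *
{
 stage = 4;
 param3 = [100,150,200,250,300]
} *
 endparam = [0, 1]
"""

result = tree.parseString(sample, parseAll=True).asList()
print(result)
```

Passing parseAll=True makes the parser raise an exception instead of silently stopping at the first piece of text it cannot match.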

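This adjusted sample (spaces removed from the lists, every leaf terminated with ;, and the * characters dropped) does parse successfully with the parser built from the BNF above: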

sample = """
{
 stage = 3;
 some.param1 = [10,20];
}
{
 stage = 4;
 param3 = [100,150,200,250,300];
}
 endparam = [0,1];
 """

parsed = tree.parseString(sample)    
parsed.pprint()

This gives:

[[['stage', '3'], ['some.param1', '[10,20]']],
 [['stage', '4'], ['param3', '[100,150,200,250,300]']],
 ['endparam', '[0,1]']]
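Note that the bare endparam leaf comes back as a single [name, value] pair, while the question asked for it to sit in its own group. If that matters, a small post-processing step can normalize the shape. The helper name to_groups below is hypothetical, and the nested list is copied from the output above rather than produced by the parser.

```python
# Nested list copied from the output above.
parsed = [
    [["stage", "3"], ["some.param1", "[10,20]"]],
    [["stage", "4"], ["param3", "[100,150,200,250,300]"]],
    ["endparam", "[0,1]"],
]

def to_groups(items):
    """Wrap bare leaves so every top-level entry is a list of [name, value] pairs."""
    groups = []
    for item in items:
        # A bare leaf is a pair of strings; a group is a list of such pairs.
        is_bare_leaf = len(item) == 2 and all(isinstance(part, str) for part in item)
        groups.append([item] if is_bare_leaf else item)
    return groups

groups = to_groups(parsed)
# Each group can then be turned into a dict keyed by parameter name.
as_dicts = [dict(group) for group in groups]
print(groups)
print(as_dicts)
```

With a live parse result, calling asList() on it first yields the same kind of nested list.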
