Weird early EOF termination of Forward() parser_element

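After carefully reading and debugging several of the Forward() features in the pyparsing examples, I cobbled a few of them together to fit what I need for ISC Bind9/DHCP configuration files.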

There is one EBNF (detailed at http://www.zytrax.com/books/dns/ch7/address_match_list.html) that I am struggling with here:

address_match_list = element ; [ element; ... ]

element = [!] (ip [/prefix] | key key-name | "acl_name" | { address_match_list } )

My final (best-fitting, but still failing) draft is:

element = Forward()
element <<= (
    # Hide the exclamation so we can do deeper parse cleaner w/o clutter of '!'
    (0, None) * Word('!') +

    # Might be nice to do a bit of lookahead for '.', ':', 'key', and '"'
    # | is matchFirst, not matchLongest
    # ^ is matchLongest
    (
        ZeroOrMore(
            (
                # Typical pattern "1.2.3.4/24;"
                (
                    Combine(
                        pyparsing_common.ipv4_address + '/' + Word(nums, max=3)
                    ) + ';'
                ) ^                                        # Start: '999.999.999.999/99'
                # Typical pattern "2.3.4.5;"
                (pyparsing_common.ipv4_address + ';') ^    # Start: '999.999.999.999'
                # Typical pattern "3210::1;"
                (pyparsing_common.ipv6_address + ';') ^    # Start: 'XXXX:'
                (Keyword('key') + Word(alphanums, max=63) + ';')
                                                           # Start: 'key <key-varname>'
            )
        ) ^
        # Typical pattern "{ 1.2.3.4; };"
        ZeroOrMore('{' - element + '}' + ';')
    ).setParseAction(pushFirst)
).setParseAction(pushExclamation)
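The comments in the draft rely on the difference between `|` (MatchFirst) and `^` (Or, longest match). A minimal sketch of that difference, using two made-up literals (`short` and `longer` are illustrative names, not part of the grammar above):

```python
from pyparsing import Literal

short = Literal("key")
longer = Literal("keys")

match_first = short | longer    # MatchFirst: the first alternative that matches wins
match_longest = short ^ longer  # Or: every alternative is tried, the longest match wins

first_tokens = match_first.parseString("keys").asList()
longest_tokens = match_longest.parseString("keys").asList()
print(first_tokens)    # ['key']
print(longest_tokens)  # ['keys']
```

Note that `^` picks one alternative for the whole match; it does not interleave its alternatives.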

And when I run element.runTests():

element.runTests('2.2.2.2; { 3.3.3.3; };')
2.2.2.2; { 3.3.3.3; };
         ^
FAIL: Expected end of text, found '{'  (at char 9), (line:1, col:10)

That unexpected 'Expected end of text' right after the first element matches is what stops the whole parser.
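For context, runTests() parses each line with parseAll=True by default, so any input left over after a successful match is reported as "Expected end of text". A minimal sketch with a toy `item` expression (an assumption for illustration, not the grammar above):

```python
from pyparsing import ParseException, Word, ZeroOrMore, nums

item = Word(nums) + ';'
text = "1; 2;"

# Without parseAll, a match on just the leading part of the input succeeds.
partial = item.parseString(text).asList()
print(partial)  # ['1', ';']

# With parseAll=True (what runTests uses), the leftover "2;" is an error.
try:
    item.parseString(text, parseAll=True)
    message = ""
except ParseException as err:
    message = str(err)
print(message)

# Wrapping the same expression in a repetition consumes the whole input.
repeated = ZeroOrMore(item).parseString(text, parseAll=True).asList()
print(repeated)  # ['1', ';', '2', ';']
```

So the message usually means "the grammar stopped matching early", rather than a problem with end-of-input handling itself.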

A standalone snippet that demonstrates the problem:

#!/usr/bin/env python3
# EBNF detailed at http://www.zytrax.com/books/dns/ch7/address_match_list.html
from pyparsing import *
exprStack = []

def pushFirst(strg, loc, toks):
    exprStack.append(toks[0])

def pushExclamation(strg, loc, toks):
    for t in toks:
        if t == '!':
            exprStack.append('!')
        else:
            break

# Address_Match_List (AML)
# This AML combo is ordered very carefully so that the longest patterns are tried first
#
# EBNF reiterated here:
#
#    address_match_list = element ; [ element; ... ]
#
#    element = [!] (ip [/prefix] | key key-name | "acl_name" | { address_match_list } )
#
element = Forward()
element <<= (
    # Hide the exclamation so we can do deeper parse cleaner w/o clutter of '!'
    (0, None) * Word('!') +

    # Might be nice to do a bit of lookahead for '.', ':', 'key', and '"'
    # | is matchFirst, not matchLongest
    # ^ is matchLongest
    (
        ZeroOrMore(
            (
                # Typical pattern "1.2.3.4/24;"
                (
                    Combine(
                        pyparsing_common.ipv4_address + '/' + Word(nums, max=3)
                    ) + ';'
                ) ^                                        # Start: '999.999.999.999/99'
                # Typical pattern "2.3.4.5;"
                (pyparsing_common.ipv4_address + ';') ^    # Start: '999.999.999.999'
                # Typical pattern "3210::1;"
                (pyparsing_common.ipv6_address + ';') ^    # Start: 'XXXX:'
                (
                    Keyword('key') + Word(alphanums, max=63) + ';'
                )                                          # Start: 'key <key-variable-name>'
            )
        ) ^
        # Typical pattern "{ 1.2.3.4; };"
        ZeroOrMore('{' + element + '}' + ';')
    ).setParseAction(pushFirst)
).setParseAction(pushExclamation)
element.setName('"element ;"')
element.setDebug()

result = element.runTests("""
123.123.123.123;
!210.210.210.210;
{ 234.234.234.234 };
2.2.2.2; { 3.3.3.3; };
{ 4.4.4.4; }; 5.5.5.5;
{ 6.6.6.6; 7.7.7.7; }; 8.8.8.8;
!{ 9.9.9.9; 10.10.10.10; };
12.12.12.12; !13.13.13.13;
14.14.14.14/15; 16.16.16.16; key MySha512Key;
17.17.17.17/18; { 19.19.19.19; }; key YourSha512Key; }
""")

import pprint
pp = pprint.PrettyPrinter(indent=4)
print("Result: ")
pp.pprint(result)

Test run of valid syntax content

Complete element.runTests() output:


123.123.123.123;
['123.123.123.123', ';']

!210.210.210.210;
['!', '210.210.210.210', ';']

{ 234.234.234.234 };
^
FAIL: exception raised in parse action  (at char 0), (line:1, col:1)

2.2.2.2; { 3.3.3.3; };
         ^
FAIL: Expected end of text, found '{'  (at char 9), (line:1, col:10)

{ 4.4.4.4; }; 5.5.5.5;
              ^
FAIL: Expected end of text, found '5'  (at char 14), (line:1, col:15)

{ 6.6.6.6; 7.7.7.7; }; 8.8.8.8;
                       ^
FAIL: Expected end of text, found '8'  (at char 23), (line:1, col:24)

!{ 9.9.9.9; 10.10.10.10; };
['!', '{', '9.9.9.9', ';', '10.10.10.10', ';', '}', ';']

12.12.12.12; !13.13.13.13;
             ^
FAIL: Expected end of text, found '!'  (at char 13), (line:1, col:14)

14.14.14.14/15; 16.16.16.16; key MySha512Key;
['14.14.14.14/15', ';', '16.16.16.16', ';', 'key', 'MySha512Key', ';']

17.17.17.17/18; { 19.19.19.19; }; key YourSha512Key; }
                ^
FAIL: Expected end of text, found '{'  (at char 16), (line:1, col:17)

The pretty-printed result is:

Result: 
(   False,
    [   ('123.123.123.123;', (['123.123.123.123', ';'], {})),
        ('!210.210.210.210;', (['!', '210.210.210.210', ';'], {})),
        (   '{ 234.234.234.234 };',
            exception raised in parse action  (at char 0), (line:1, col:1)),
        (   '2.2.2.2; { 3.3.3.3; };',
            Expected end of text, found '{'  (at char 9), (line:1, col:10)),
        (   '{ 4.4.4.4; }; 5.5.5.5;',
            Expected end of text, found '5'  (at char 14), (line:1, col:15)),
        (   '{ 6.6.6.6; 7.7.7.7; }; 8.8.8.8;',
            Expected end of text, found '8'  (at char 23), (line:1, col:24)),
        (   '!{ 9.9.9.9; 10.10.10.10; };',
            (['!', '{', '9.9.9.9', ';', '10.10.10.10', ';', '}', ';'], {})),
        (   '12.12.12.12; !13.13.13.13;',
            Expected end of text, found '!'  (at char 13), (line:1, col:14)),
        (   '14.14.14.14/15; 16.16.16.16; key MySha512Key;',
            (['14.14.14.14/15', ';', '16.16.16.16', ';', 'key', 'MySha512Key', ';'], {})),
        (   '17.17.17.17/18; { 19.19.19.19; }; key YourSha512Key; }',
            Expected end of text, found '{'  (at char 16), (line:1, col:17))])

Process finished with exit code 0

I am still slowly debugging the 234.234.234.234 and 3.3.3.3 cases, so I am hoping someone will take a look while I do and say 'there it is'.
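One possible reading of the "exception raised in parse action" failures: on the pyparsing versions that print this message, an IndexError raised inside a parse action is reported as a ParseException with exactly that text, and pushFirst indexes toks[0] even when the repetition matched nothing. A minimal sketch under that assumption (`push_first_guarded` is a hypothetical variant, not from the original code):

```python
from pyparsing import ParseException, Word, ZeroOrMore, nums

stack = []

def push_first(strg, loc, toks):
    # Mirrors pushFirst: toks[0] raises IndexError when nothing was matched.
    stack.append(toks[0])

def push_first_guarded(strg, loc, toks):
    # Hypothetical variant: skip empty matches instead of indexing into them.
    if toks:
        stack.append(toks[0])

unguarded = ZeroOrMore(Word(nums)).setParseAction(push_first)
guarded = ZeroOrMore(Word(nums)).setParseAction(push_first_guarded)

# "abc" contains no digits, so ZeroOrMore succeeds with an empty token list.
try:
    unguarded.parseString("abc")
    message = ""
except ParseException as err:
    message = str(err)
print(message)

guarded_tokens = guarded.parseString("abc").asList()
print(guarded_tokens)  # []
```

If this is what happens in the grammar above, the position "(at char 0)" points at where the parse action ran, not at the character that actually failed to match.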

Test run of intentionally failing syntax

Update: added test code for intentionally failing syntax content:

result = element.runTests("""
20
!
key;
21;
{ 23 };
{ 24.24.24.24 };
{ 25.25.25.25; }
26.26.26.26
27.27.27.27; key
28.28.28.28; { key }
29.29.29.29, 30.30.30.30;
{ 31.31.31.31; 32.32.32.32; }
{ 33.33.33.33; 34.34.34.34; }; 35;
""", failureTests=True)
print("Result of failed contents: ")
pp.pprint(result)
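A note on reading the result of the call above: runTests() returns a (success, results) pair, and with failureTests=True the success flag is True when every line fails to parse, whatever the reason for the failure. A minimal sketch with a toy `integer` expression (an assumption for illustration):

```python
from pyparsing import Word, nums

integer = Word(nums)

# Normal mode: success is True only if every line parses completely.
all_passed, _ = integer.runTests("12\n345", printResults=False)

# failureTests=True: success is True only if every line fails to parse.
all_failed, results = integer.runTests(
    "abc\n12x", failureTests=True, printResults=False
)

print(all_passed, all_failed)
for test_string, outcome in results:
    # For a failing line, outcome is the exception that was raised.
    print(repr(test_string), "->", type(outcome).__name__)
```

A True flag in failure mode therefore does not show that each input was rejected for the intended reason; the individual messages still need to be checked.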

Test run of the failing content (pretty-printed):

Result of failed contents: 
(   True,
    [   ('20', exception raised in parse action  (at char 0), (line:1, col:1)),
        ('!', exception raised in parse action  (at char 0), (line:1, col:1)),
        (   'key;',
            exception raised in parse action  (at char 0), (line:1, col:1)),
        ('21;', exception raised in parse action  (at char 0), (line:1, col:1)),
        (   '{ 23 };',
            exception raised in parse action  (at char 0), (line:1, col:1)),
        (   '{ 24.24.24.24 };',
            exception raised in parse action  (at char 0), (line:1, col:1)),
        (   '{ 25.25.25.25; }',
            exception raised in parse action  (at char 0), (line:1, col:1)),
        (   '26.26.26.26',
            exception raised in parse action  (at char 0), (line:1, col:1)),
        (   '27.27.27.27; key',
            Expected end of text, found 'k'  (at char 13), (line:1, col:14)),
        (   '28.28.28.28; { key }',
            Expected end of text, found '{'  (at char 13), (line:1, col:14)),
        (   '29.29.29.29, 30.30.30.30;',
            exception raised in parse action  (at char 0), (line:1, col:1)),
        (   '{ 31.31.31.31; 32.32.32.32; }',
            exception raised in parse action  (at char 0), (line:1, col:1)),
        (   '{ 33.33.33.33; 34.34.34.34; }; 35;',
            Expected end of text, found '3'  (at char 31), (line:1, col:32))])

Process finished with exit code 0
Match "element ;" at loc 0(1,1)
Exception raised:exception raised in parse action  (at char 0), (line:1, col:1)

20
^
FAIL: exception raised in parse action  (at char 0), (line:1, col:1)
Match "element ;" at loc 0(1,1)
Exception raised:exception raised in parse action  (at char 0), (line:1, col:1)

!
^
FAIL: exception raised in parse action  (at char 0), (line:1, col:1)
Match "element ;" at loc 0(1,1)
Exception raised:exception raised in parse action  (at char 0), (line:1, col:1)

key;
^
FAIL: exception raised in parse action  (at char 0), (line:1, col:1)
Match "element ;" at loc 0(1,1)
Exception raised:exception raised in parse action  (at char 0), (line:1, col:1)

21;
^
FAIL: exception raised in parse action  (at char 0), (line:1, col:1)
Match "element ;" at loc 0(1,1)
Match "element ;" at loc 1(1,2)
Matched "element ;" -> []
Exception raised:exception raised in parse action  (at char 0), (line:1, col:1)

{ 23 };
^
FAIL: exception raised in parse action  (at char 0), (line:1, col:1)
Match "element ;" at loc 0(1,1)
Match "element ;" at loc 1(1,2)
Matched "element ;" -> []
Exception raised:exception raised in parse action  (at char 0), (line:1, col:1)

{ 24.24.24.24 };
^
FAIL: exception raised in parse action  (at char 0), (line:1, col:1)
Match "element ;" at loc 0(1,1)
Match "element ;" at loc 1(1,2)
Matched "element ;" -> ['25.25.25.25', ';']
Exception raised:exception raised in parse action  (at char 0), (line:1, col:1)

{ 25.25.25.25; }
^
FAIL: exception raised in parse action  (at char 0), (line:1, col:1)
Match "element ;" at loc 0(1,1)
Exception raised:exception raised in parse action  (at char 0), (line:1, col:1)

26.26.26.26
^
FAIL: exception raised in parse action  (at char 0), (line:1, col:1)
Match "element ;" at loc 0(1,1)
Matched "element ;" -> ['27.27.27.27', ';']

27.27.27.27; key
             ^
FAIL: Expected end of text, found 'k'  (at char 13), (line:1, col:14)
Match "element ;" at loc 0(1,1)
Matched "element ;" -> ['28.28.28.28', ';']

28.28.28.28; { key }
             ^
FAIL: Expected end of text, found '{'  (at char 13), (line:1, col:14)
Match "element ;" at loc 0(1,1)
Exception raised:exception raised in parse action  (at char 0), (line:1, col:1)

29.29.29.29, 30.30.30.30;
^
FAIL: exception raised in parse action  (at char 0), (line:1, col:1)
Match "element ;" at loc 0(1,1)
Match "element ;" at loc 1(1,2)
Matched "element ;" -> ['31.31.31.31', ';', '32.32.32.32', ';']
Exception raised:exception raised in parse action  (at char 0), (line:1, col:1)

{ 31.31.31.31; 32.32.32.32; }
^
FAIL: exception raised in parse action  (at char 0), (line:1, col:1)
Match "element ;" at loc 0(1,1)
Match "element ;" at loc 1(1,2)
Matched "element ;" -> ['33.33.33.33', ';', '34.34.34.34', ';']
Match "element ;" at loc 1(1,2)
Matched "element ;" -> ['33.33.33.33', ';', '34.34.34.34', ';']
Matched "element ;" -> ['{', '33.33.33.33', ';', '34.34.34.34', ';', '}', ';']

{ 33.33.33.33; 34.34.34.34; }; 35;
                               ^
FAIL: Expected end of text, found '3'  (at char 31), (line:1, col:

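Update: Following the answer provided by Paul McG, I have updated the code snippet per his suggestions.

Before that, I found two more errors in my two test runs (valid syntax and invalid syntax); both were in the valid-syntax test run: a missing ';' inside the braces of the third test line, and a stray trailing '}' on the last one. I have updated the test snippet to: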

result = element.runTests("""
123.123.123.123;
!210.210.210.210;
{ 234.234.234.234; };
2.2.2.2; { 3.3.3.3; };
{ 4.4.4.4; }; 5.5.5.5;
{ 6.6.6.6; 7.7.7.7; }; 8.8.8.8;
!{ 9.9.9.9; 10.10.10.10; };
12.12.12.12; !13.13.13.13;
14.14.14.14/15; 16.16.16.16; key MySha512Key;
17.17.17.17/18; { 19.19.19.19; }; key YourSha512Key;
""")
print("Result of valid contents: ")
pp.pprint(result)

Now the test results are narrowed down to just one failing case:

Result of valid contents: 
(   False,
    [   ('123.123.123.123;', (['123.123.123.123', ';'], {})),
        ('!210.210.210.210;', (['!', '210.210.210.210', ';'], {})),
        (   '{ 234.234.234.234; };',
            ([(['{', '234.234.234.234', ';', '}', ';'], {})], {})),
        (   '2.2.2.2; { 3.3.3.3; };',
            (['2.2.2.2', ';', (['{', '3.3.3.3', ';', '}', ';'], {})], {})),
        (   '{ 4.4.4.4; }; 5.5.5.5;',
            ([(['{', '4.4.4.4', ';', '}', ';'], {}), '5.5.5.5', ';'], {})),
        (   '{ 6.6.6.6; 7.7.7.7; }; 8.8.8.8;',
            ([(['{', '6.6.6.6', ';', '7.7.7.7', ';', '}', ';'], {}), '8.8.8.8', ';'], {})),
        (   '!{ 9.9.9.9; 10.10.10.10; };',
            (['!', (['{', '9.9.9.9', ';', '10.10.10.10', ';', '}', ';'], {})], {})),
        (   '12.12.12.12; !13.13.13.13;',
            Expected end of text, found '!'  (at char 13), (line:1, col:14)),
        (   '14.14.14.14/15; 16.16.16.16; key MySha512Key;',
            (['14.14.14.14/15', ';', '16.16.16.16', ';', 'key', 'MySha512Key', ';'], {})),
        (   '17.17.17.17/18; { 19.19.19.19; }; key YourSha512Key;',
            (['17.17.17.17/18', ';', (['{', '19.19.19.19', ';', '}', ';'], {}), 'key', 'YourSha512Key', ';'], {}))])

This is a major step forward.

I noted the following basic changes:

We are left with one error, related to the exclamation mark used in a nested element.

import pprint
pp = pprint.PrettyPrinter(indent=4)
result = element.runTests("""
12.12.12.12; !13.13.13.13;
""")
print("Result of valid contents: ")
pp.pprint(result)

The test result is:

Match "element ;" at loc 0(1,1)
Matched "element ;" -> ['12.12.12.12', ';']

12.12.12.12; !13.13.13.13;
             ^
FAIL: Expected end of text, found '!'  (at char 13), (line:1, col:14)
Result of valid contents: 
(   False,
    [   (   '12.12.12.12; !13.13.13.13;',
            Expected end of text, found '!'  (at char 13), (line:1, col:14))])

Final run of the working solution

In the final test code, I took Paul McG's suggestion and pushed the exclamation-mark parser element inside the ZeroOrMore, as follows:

# Address_Match_List (AML)
# This AML combo is ordered very carefully so that the longest patterns are tried first
#
# EBNF reiterated here:
#
#    address_match_list = element ; [ element; ... ]
#
#    element = [!] (ip [/prefix] | key key-name | "acl_name" | { address_match_list } )
#
element = Forward()
element <<= (
    # Might be nice to do a bit of lookahead for '.', ':', 'key', and '"'
    # | is matchFirst, not matchLongest
    # ^ is matchLongest
    ZeroOrMore(
        # Hide the exclamation so we can do deeper parse cleaner w/o clutter of '!'
        (0, None) * Word('!') +
        (
                (
                        (Combine(pyparsing_common.ipv4_address + '/' + Word(nums, max=3)) + ';')
                        ^ (pyparsing_common.ipv4_address + ';')
                        ^ (pyparsing_common.ipv6_address + ';')
                        ^ (Keyword('key') + Word(alphanums, max=63) + ';')
                        ^ Keyword('acl_name')
                ).setParseAction(pushFirst)
                ^ Group('{' - delimitedList(element, delim=';') + '}' + ';')
        )
    )
).setParseAction(pushExclamation)
element.setName('"element ;"')
element.setDebug()

import pprint

pp = pprint.PrettyPrinter(indent=4)
result = element.runTests("""
123.123.123.123;
!210.210.210.210;
{ 234.234.234.234; };
2.2.2.2; { 3.3.3.3; };
{ 4.4.4.4; }; 5.5.5.5;
{ 6.6.6.6; 7.7.7.7; }; 8.8.8.8;
!{ 9.9.9.9; 10.10.10.10; };
12.12.12.12; !13.13.13.13;
14.14.14.14/15; 16.16.16.16; key MySha512Key;
17.17.17.17/18; { 19.19.19.19; }; key YourSha512Key;
""")
print("Result of valid contents: ")
pp.pprint(result)

After running the test above, the results for the valid syntax content are:

Result of valid contents: 
(   True,
    [   ('123.123.123.123;', (['123.123.123.123', ';'], {})),
        ('!210.210.210.210;', (['!', '210.210.210.210', ';'], {})),
        (   '{ 234.234.234.234; };',
            ([(['{', '234.234.234.234', ';', '}', ';'], {})], {})),
        (   '2.2.2.2; { 3.3.3.3; };',
            (['2.2.2.2', ';', (['{', '3.3.3.3', ';', '}', ';'], {})], {})),
        (   '{ 4.4.4.4; }; 5.5.5.5;',
            ([(['{', '4.4.4.4', ';', '}', ';'], {}), '5.5.5.5', ';'], {})),
        (   '{ 6.6.6.6; 7.7.7.7; }; 8.8.8.8;',
            ([(['{', '6.6.6.6', ';', '7.7.7.7', ';', '}', ';'], {}), '8.8.8.8', ';'], {})),
        (   '!{ 9.9.9.9; 10.10.10.10; };',
            (['!', (['{', '9.9.9.9', ';', '10.10.10.10', ';', '}', ';'], {})], {})),
        (   '12.12.12.12; !13.13.13.13;',
            (['12.12.12.12', ';', '!', '13.13.13.13', ';'], {})),
        (   '14.14.14.14/15; 16.16.16.16; key MySha512Key;',
            (['14.14.14.14/15', ';', '16.16.16.16', ';', 'key', 'MySha512Key', ';'], {})),
        (   '17.17.17.17/18; { 19.19.19.19; }; key YourSha512Key;',
            (['17.17.17.17/18', ';', (['{', '19.19.19.19', ';', '}', ';'], {}), 'key', 'YourSha512Key', ';'], {}))])

Wow. The answer below solved it. I need to work through it more so that I can better summarize the "why".
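One way to see the "why" for the exclamation mark, as a minimal sketch with a toy `item` expression standing in for the real alternatives (an assumption for illustration): a prefix placed outside the repetition can only match once, at the very start, while a prefix inside the repetition is tried before every item.

```python
from pyparsing import Optional, ParseException, Word, ZeroOrMore, nums

item = Word(nums) + ';'

# '!' outside the repetition: allowed only once, before the first item.
outside = Optional('!') + ZeroOrMore(item)
# '!' inside the repetition: each item may carry its own '!'.
inside = ZeroOrMore(Optional('!') + item)

text = "12; !13;"

try:
    outside.parseString(text, parseAll=True)
    outside_ok = True
except ParseException:
    outside_ok = False

inside_tokens = inside.parseString(text, parseAll=True).asList()
print(outside_ok)     # False
print(inside_tokens)  # ['12', ';', '!', '13', ';']
```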

The rest of the ISC-style configuration can now be done easily.

This might get you closer, but I'm not sure it handles the stack bits correctly.

element = Forward()
element <<= (
    # Hide the exclamation so we can do deeper parse cleaner w/o clutter of '!'
    (0, None) * Word('!') +

    # Might be nice to do a bit of lookahead for '.', ':', 'key', and '"'
    # | is matchFirst, not matchLongest
    # ^ is matchLongest
    ZeroOrMore(
        (
            (Combine(pyparsing_common.ipv4_address + '/' + Word(nums, max=3)) + ';')
            ^ (pyparsing_common.ipv4_address + ';')
            ^ (pyparsing_common.ipv6_address + ';')
            ^ (Keyword('key') + Word(alphanums, max=63) + ';')
            ^ Keyword('acl_name')
        ).setParseAction(pushFirst)
        ^ Group('{' - delimitedList(element, delim=';') + '}' + ';')
    )
).setParseAction(pushExclamation)

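I've started formatting my long expressions with the operators at the start of the next line, which I find more readable. I guessed that you might want to keep the elements inside {} in their own subgroup, so I grouped them. If you want to cut down on the clutter, all of those semicolons look like they could be suppressed, provided you structure your results appropriately.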
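A rough sketch of what suppressing the punctuation could look like, assuming each element is wrapped in its own Group so that a '!' stays attached to the item it negates. The names `SEMI`, `LBRACE`, `RBRACE`, `ip4`, `key_ref`, `nested` and `aml` are illustrative, the parse actions are left out, and ipv6 and acl_name are omitted for brevity:

```python
from pyparsing import (
    Combine, Forward, Group, Keyword, OneOrMore, Optional, Suppress, Word,
    alphanums, nums, pyparsing_common,
)

SEMI = Suppress(';')
LBRACE = Suppress('{')
RBRACE = Suppress('}')

element = Forward()

# "1.2.3.4" or "1.2.3.4/24", kept as a single token
ip4 = Combine(pyparsing_common.ipv4_address + Optional('/' + Word(nums, max=3)))
# "key <name>"
key_ref = Keyword('key') + Word(alphanums, max=63)
# "{ element; element; ... }" as its own subgroup
nested = Group(LBRACE + OneOrMore(element) + RBRACE)

# Each element is grouped with its optional '!'; the trailing ';' is dropped.
element <<= Group(Optional('!') + (ip4 | key_ref | nested)) + SEMI

aml = OneOrMore(element)

tokens = aml.parseString(
    "2.2.2.2; !{ 3.3.3.3; 4.4.4.4/24; }; key MyKey;", parseAll=True
).asList()
print(tokens)
# [['2.2.2.2'], ['!', [['3.3.3.3'], ['4.4.4.4/24']]], ['key', 'MyKey']]
```

With the braces and semicolons suppressed, the nesting of the result lists carries the structure that the punctuation carried in the input.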