How to isolate test cases for ANTLR parser in Python?

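I want to test my ANTLR parser with Python and the ANTLR v4 runtime, using a few test cases. The problem I am running into is the isolation of the test cases. Each test case should parse a different file, but the parser (or the stream reader, I can't tell which) seems to concatenate the files read so far. So in test case n, the parse result is the concatenation of the results of case n-1 and case n (and so on, recursively, back to the first test case).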

Here is a minimal working example:

File Minimal.g4:

grammar Minimal ;

WS : [ \t\n\r\f] ;

WORD : (~ ([ \n\r\t\f]) )+ ;

text : ( token | WS+ )* ;

token : WORD ;

File minimal/minimal.py:

from antlr4 import *
from minimal.MinimalListener import MinimalListener

class MinimalListListener(MinimalListener):

    tokens = []

    def exitToken(self, ctx):
        self.tokens.append(ctx.WORD().getText())

File minimal/test_minimal.py:

import unittest
import os
from antlr4 import *
from minimal.MinimalListener import MinimalListener
from minimal.MinimalLexer import MinimalLexer
from minimal.MinimalParser import MinimalParser

from minimal.minimal import MinimalListListener


class MinimalTstBase(unittest.TestCase):

    fname = None
    
    def setUp(self):
        self.parsed = []
        input_stream = FileStream(os.path.join("samples", self.fname), encoding="utf-8")
        lexer = MinimalLexer(input_stream)
        stream = CommonTokenStream(lexer)
        parser = MinimalParser(stream)
        tree = parser.text()    # start rule
        printer = MinimalListListener()
        walker = ParseTreeWalker()
        walker.walk(printer, tree)
        self.parsed = printer.tokens # get result stored in the listener

    def tearDown(self):
        self.parsed = []
        
class TestCite1(MinimalTstBase):

    fname = "cite1.txt"

    def test_parsed(self):
        self.assertEqual(
            self.parsed,
            ["A", "B"])

class TestCite2(MinimalTstBase):

    fname = "cite2.txt"

    def test_parsed(self):
        self.assertEqual(
            self.parsed,
            ['c', 'd'])

File samples/cite1.txt:

A B

File samples/cite2.txt:

 c d

The file minimal/__init__.py is just an empty file. The file setup.py is just boilerplate:

from setuptools import find_packages, setup

setup(
    name = "minimal-antlr-testing-example",
    version = "0.0.1",
    description = "Minimal testing example",
    classifiers = [
        "Programming Language :: Python :: 3",
        "License :: OSI Approved :: MIT License",
        "Operating System :: OS Independent",
    ],
    packages = find_packages(where="."),
    python_requires = ">=3.6",
    install_requires = [
        "wheel",
        "setuptools",
        "antlr4-python3-runtime==4.9.1"
        ],
    test_suite = 'minimal',
    )

File requirements.txt:

wheel==0.36.2
antlr4-python3-runtime==4.9.1

After running

java -cp antlr-4.9.1-complete.jar org.antlr.v4.Tool -o minimal/ \
-Xexact-output-dir -Dlanguage="Python3" Minimal.g4

the following files exist:

./minimal/Minimal.tokens
./minimal/MinimalListener.py
./minimal/MinimalParser.py
./minimal/Minimal.interp
./minimal/MinimalLexer.tokens
./minimal/MinimalLexer.py
./minimal/MinimalLexer.interp
./minimal/test_minimal.py
./minimal/minimal.py
./minimal/__init__.py
./samples/cite2.txt
./samples/cite1.txt
./Minimal.g4
./requirements.txt
./setup.py

Installing and running the tests:

python3 -m venv env
source env/bin/activate
python -m pip install -r requirements.txt
python setup.py bdist_wheel
python setup.py test

...

test_parsed (minimal.test_minimal.TestCite1) ... ok
test_parsed (minimal.test_minimal.TestCite2) ... FAIL

======================================================================
FAIL: test_parsed (minimal.test_minimal.TestCite2)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/clueck/src/minimal/antlr/minimal/test_minimal.py", line 46, in test_parsed
    ['c', 'd'])
AssertionError: Lists differ: ['A', 'B', 'c', 'd'] != ['c', 'd']

First differing element 0:
'A'
'c'

First list contains 2 additional elements.
First extra element 2:
'c'

- ['A', 'B', 'c', 'd']
+ ['c', 'd']

How can I isolate the parser runs when running the unit tests?

Kind regards

class MinimalListListener(MinimalListener):

    tokens = []

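This creates a single list that belongs to the MinimalListListener class and is shared by all instances of that class. So if you create two instances of the class and both append elements to the list, the list ends up containing the elements from both instances.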
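
The sharing can be reproduced without ANTLR at all. The following is a minimal sketch; SharedCollector is a hypothetical stand-in for the listener and is not part of the generated code:

```python
class SharedCollector:
    # Class-level attribute: a single list object, created once
    # when the class body is executed.
    tokens = []

    def add(self, word):
        # self.tokens resolves to the class attribute here, so every
        # instance appends to the same list.
        self.tokens.append(word)


first = SharedCollector()
first.add("A")
first.add("B")

second = SharedCollector()
second.add("c")
second.add("d")

print(second.tokens)                  # ['A', 'B', 'c', 'd']
print(first.tokens is second.tokens)  # True
```

This is also why resetting self.parsed in tearDown has no effect: the assignment only rebinds the attribute on the test instance, while the list stored on the listener class keeps its contents.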

To create a variable that is specific to each instance in Python, set it in __init__ instead of in the class body:

class MinimalListListener(MinimalListener):
    def __init__(self):
        self.tokens = []
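
As a rough check of this fix, here is a self-contained sketch that runs without the ANTLR runtime. FakeListenerBase, FakeTerminal, FakeTokenContext and collect are hypothetical stand-ins introduced only for illustration; in the real project the base class would be the generated MinimalListener and the contexts would come from the parse tree walker:

```python
class FakeListenerBase:
    """Hypothetical stand-in for the generated MinimalListener."""


class FakeTerminal:
    """Hypothetical stand-in for the node returned by ctx.WORD()."""

    def __init__(self, text):
        self._text = text

    def getText(self):
        return self._text


class FakeTokenContext:
    """Hypothetical stand-in for the context passed to exitToken."""

    def __init__(self, text):
        self._word = FakeTerminal(text)

    def WORD(self):
        return self._word


class MinimalListListener(FakeListenerBase):
    def __init__(self):
        super().__init__()
        # Instance attribute: a fresh list for every listener object.
        self.tokens = []

    def exitToken(self, ctx):
        self.tokens.append(ctx.WORD().getText())


def collect(words):
    # Simulates one test's setUp: a new listener per "parse".
    listener = MinimalListListener()
    for word in words:
        listener.exitToken(FakeTokenContext(word))
    return listener.tokens


first = collect(["A", "B"])
second = collect(["c", "d"])
print(first)   # ['A', 'B']
print(second)  # ['c', 'd']
```

Since setUp in the question already creates a new MinimalListListener for each test, moving the list into __init__ should be enough for each test case to see only the tokens from its own file.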