Correctly suppress comments

Question

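I want to filter out comments starting with a hash (#) from a text file before running a larger parser over it.

For this, I am using the suppress approach that was mentioned.

pythonStyleComment does not work, because it ignores quotes and removes content inside them. A hash inside a quoted string is not a comment; it is part of the string and should therefore be preserved.

Here is the pytest I have written to test the expected behavior.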

def test_filter_comment():
    teststrings = [
        '# this is comment', 'Option "sadsadlsad#this is not a comment"'
    ]
    expected = ['', 'Option "sadsadlsad#this is not a comment"']

    for i, teststring in enumerate(teststrings):
        result = filter_comments.transformString(teststring)
        assert result == expected[i]

My current implementation breaks somewhere inside pyparsing. I am probably doing something that was not intended:

filter_comments = Regex(r"#.*")
filter_comments = filter_comments.suppress()
filter_comments = filter_comments.ignore(QuotedString)

It fails with:

*****/lib/python3.7/site-packages/pyparsing.py:4480: in ignore
    super(ParseElementEnhance, self).ignore(other)
*****/lib/python3.7/site-packages/pyparsing.py:2489: in ignore
    self.ignoreExprs.append(Suppress(other.copy()))
E   TypeError: copy() missing 1 required positional argument: 'self'

Any help on how to ignore comments correctly would be appreciated.

Answer 1

The regex you are using is incorrect.

I think you meant:

^\#.*

or

^(?:.*)\#.*
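Note that the first pattern only matches a comment at the very start of the input, and the second also consumes everything before the hash, so neither separates a hash inside quotes from a trailing comment. If a plain-regex route is preferred, one possible sketch with the standard `re` module is to match quoted strings first and keep them, and to drop only hashes found outside of them. The names `TOKEN` and `strip_comments` are hypothetical, and the sketch assumes quoted strings stay on one line and contain no escaped quotes.

```python
import re

# Assumption: quoted strings do not span lines and contain no escaped quotes.
DOUBLE_QUOTED = r'"[^"\n]*"'
SINGLE_QUOTED = r"'[^'\n]*'"
COMMENT = r"#.*"

# Group 1 captures a quoted string; a comment match leaves group 1 unset.
TOKEN = re.compile(f"({DOUBLE_QUOTED}|{SINGLE_QUOTED})|{COMMENT}")


def strip_comments(line):
    """Remove a trailing # comment, keeping hashes inside quoted strings."""
    # Quoted strings are put back unchanged; comments are replaced by "".
    return TOKEN.sub(lambda m: m.group(1) or "", line)
```

Because the alternation tries the quoted-string branches first, a hash is only treated as a comment when the scan reaches it outside of any quoted string.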

Answer 2

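Ah, I was so close. Of course I have to instantiate the QuotedString class properly. The following works:

from pyparsing import QuotedString, Regex

# Match a hash and everything after it on the line, and drop it from the output.
filter_comments = Regex(r"#.*")
filter_comments = filter_comments.suppress()
# ignore() needs parser instances, not the QuotedString class itself.
qs = QuotedString('"') | QuotedString("'")
filter_comments = filter_comments.ignore(qs)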
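The TypeError in the question's traceback comes from `other.copy()` being called on the class rather than on an instance, so nothing is bound to `self`. The same message can be reproduced without pyparsing; `Element` and `clone` below are hypothetical stand-ins, not pyparsing names.

```python
class Element:
    """Hypothetical stand-in for a parser element class."""

    def copy(self):
        return Element()


def clone(other):
    # Mirrors the other.copy() call shown in the traceback.
    return other.copy()


copied = clone(Element())  # an instance: copy() receives self implicitly

try:
    clone(Element)  # the class itself: nothing is bound to self
except TypeError as exc:
    error_message = str(exc)
```

Here `error_message` reports that `copy()` is missing the required positional argument `self`, just like the original failure.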

Here are some more tests.

def test_filter_comment():
    teststrings = [
        '# this is comment', 'Option "sadsadlsad#this is not a comment"',
        "Option 'sadsadlsad#this is not a comment'",
        "Option 'sadsadlsad'#this is a comment"
    ]
    expected = [
        '', 'Option "sadsadlsad#this is not a comment"',
        "Option 'sadsadlsad#this is not a comment'",
        "Option 'sadsadlsad'"
    ]

    for i, teststring in enumerate(teststrings):
        result = filter_comments.transformString(teststring)
        assert result == expected[i]
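For convenience, here is the same configuration and the same checks combined into one script that runs without pytest. This is only a sketch: the helper name `strip_comments` is hypothetical, and it assumes a pyparsing version that still provides the camelCase `transformString` used above (pyparsing 3 also offers the equivalent `transform_string`).

```python
from pyparsing import QuotedString, Regex

# Quoted strings in either quote style are skipped while scanning,
# so a hash inside them is never seen as the start of a comment.
quoted = QuotedString('"') | QuotedString("'")

# Match a hash to the end of the line and drop it from the output.
filter_comments = Regex(r"#.*").suppress().ignore(quoted)


def strip_comments(line):
    """Hypothetical helper: return line with any # comment removed."""
    return filter_comments.transformString(line)


# Input/expected pairs taken from the tests above.
cases = [
    ('# this is comment', ''),
    ('Option "sadsadlsad#this is not a comment"',
     'Option "sadsadlsad#this is not a comment"'),
    ("Option 'sadsadlsad#this is not a comment'",
     "Option 'sadsadlsad#this is not a comment'"),
    ("Option 'sadsadlsad'#this is a comment", "Option 'sadsadlsad'"),
]

for line, expected in cases:
    assert strip_comments(line) == expected
```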
