How to ignore pyparsing ParseException and proceed?

Question

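I would like to ignore lines in a file that do not match any of my predefined parsers, and continue parsing. The lines I want to ignore vary widely, and I cannot inspect them all and define a parser for each one.

I use try..except with pass once the ParseException is caught. However, parsing stops immediately.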

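# fragment from inside a function; assumes `from pyparsing import ParseException`
try:
    return parser.parseFile(filename, parse_all)
except ParseException as err:
    # build a report that shows the offending line with a caret under the error column
    msg = 'Error during parsing of {}, line {}'.format(filename, err.lineno)
    msg += '\n' + '-'*70 + '\n'
    msg += err.line + '\n'
    msg += ' '*(err.col-1) + '^\n'
    msg += '-'*70 + '\n' + err.msg
    err.msg = msg

    print(err.msg)
    pass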

I would like parsing to continue even when a ParseException occurs.

Answer 1

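Pyparsing doesn't really have a "continue on error" option, so you'll need to adjust your parser so that it doesn't raise the ParseException in the first place. What you might try is adding something like | SkipTo(LineEnd())('errors*') to your parser as a last-ditch catch-all. Then you can look at the errors results name to see which lines went astray (or add a parse action to that expression to capture more than just the current line).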

import pyparsing as pp

era = "The" + pp.oneOf("Age Years") + "of" + pp.Word(pp.alphas)

era.runTests("""
    The Age of Enlightenment
    The Years of Darkness
    The Spanish Inquisition
    """)

Prints:

The Age of Enlightenment
['The', 'Age', 'of', 'Enlightenment']

The Years of Darkness
['The', 'Years', 'of', 'Darkness']

The Spanish Inquisition
    ^
FAIL: Expected Age | Years (at char 4), (line:1, col:5)

Add these lines and call runTests again:

# added to handle lines that don't match
unexpected = pp.SkipTo(pp.LineEnd(), include=True)("no_one_expects")
era = era | unexpected

Prints:

The Age of Enlightenment
['The', 'Age', 'of', 'Enlightenment']

The Years of Darkness
['The', 'Years', 'of', 'Darkness']

The Spanish Inquisition
['The Spanish Inquisition']
 - no_one_expects: 'The Spanish Inquisition'
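
Outside of runTests, the same catch-all can be used to separate matched lines from skipped ones. The following is only a minimal sketch, assuming the input is line-oriented and is parsed one line at a time (much as runTests does). The helper name parse_lines and the sample text are hypothetical, and the added StringEnd() is an assumption of this sketch (not part of the answer above) so that a line matching era only as a prefix also falls through to the catch-all:

```python
import pyparsing as pp

era = "The" + pp.oneOf("Age Years") + "of" + pp.Word(pp.alphas)

# catch-all for lines that no other expression matches
unexpected = pp.SkipTo(pp.LineEnd(), include=True)("no_one_expects")

# assumption: require era to consume the whole line, otherwise fall back
line_parser = (era + pp.StringEnd()) | unexpected


def parse_lines(text):
    """Hypothetical helper: return (matched token lists, skipped lines)."""
    matched, skipped = [], []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines are ignored entirely
        result = line_parser.parseString(line, parseAll=True)
        if "no_one_expects" in result:
            # the catch-all fired: remember where, and what was skipped
            skipped.append((lineno, result["no_one_expects"]))
        else:
            matched.append(result.asList())
    return matched, skipped


sample = """The Age of Enlightenment
The Spanish Inquisition
The Years of Darkness
"""

matched, skipped = parse_lines(sample)
print(matched)
print(skipped)
```

Parsing line by line keeps the line number available without a parse action. If the grammar spans multiple lines, this approach does not apply directly, and a parse action on the catch-all expression, as suggested above, may be the better fit.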
