How to get details from a PyYAML exception?

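I want to tell users gracefully exactly where their mucked-up YAML file is flawed. Line 288 of python-3.4.1/lib/python-3.4/yaml/scanner.py reports a common parsing error and handles it by raising an exception: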

raise ScannerError("while scanning a simple key", key.mark,
                   "could not found expected ':'", self.get_mark())

I am struggling with how to report it:

try:
    parsed_yaml = yaml.safe_load(txt)

except yaml.YAMLError as exc:
    print ("scanner error 1")
    if hasattr(exc, 'problem_mark'):
        mark = exc.problem_mark
        print("Error parsing Yaml file at line %s, column %s." %
                                            (mark.line, mark.column+1))
    else:
        print ("Something went wrong while parsing yaml file")
    return

This gives:

$ yaml_parse.py
scanner error 1
Error parsing Yaml file line 1508, column 9.

But how do I get the error text, and what is in key.mark and the other mark?

More usefully, how do I examine the PyYAML source to figure this out? The ScannerError class seems to ignore the parameters (from scanner.py line 32):

class ScannerError(MarkedYAMLError):
     pass

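The ScannerError class defines no methods of its own (the pass statement works like a no-op). That makes it functionally identical to its base class MarkedYAMLError, and that is the class that stores the data. From error.py: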

class MarkedYAMLError(YAMLError):
    def __init__(self, context=None, context_mark=None,
                 problem=None, problem_mark=None, note=None):
        self.context = context
        self.context_mark = context_mark
        self.problem = problem
        self.problem_mark = problem_mark
        self.note = note

    def __str__(self):
        lines = []
        if self.context is not None:
            lines.append(self.context)
        if self.context_mark is not None  \
           and (self.problem is None or self.problem_mark is None
                or self.context_mark.name != self.problem_mark.name
                or self.context_mark.line != self.problem_mark.line
                or self.context_mark.column != self.problem_mark.column):
            lines.append(str(self.context_mark))
        if self.problem is not None:
            lines.append(self.problem)
        if self.problem_mark is not None:
            lines.append(str(self.problem_mark))
        if self.note is not None:
            lines.append(self.note)
        return '\n'.join(lines)
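To make the stored fields concrete, here is a minimal sketch that catches the exception and reads the five attributes set in `__init__` above. It assumes plain PyYAML imported as `yaml`; `error_details` and `try_parse` are hypothetical helper names, not part of the library.

```python
import yaml


def error_details(exc):
    """Collect the fields a MarkedYAMLError stores on itself."""
    def position(mark):
        # Mark.line and Mark.column are zero-based; a mark may be None.
        return None if mark is None else (mark.line, mark.column)

    # getattr with a default keeps this safe for a plain YAMLError,
    # which carries none of these attributes.
    return {
        "context": getattr(exc, "context", None),
        "context_mark": position(getattr(exc, "context_mark", None)),
        "problem": getattr(exc, "problem", None),
        "problem_mark": position(getattr(exc, "problem_mark", None)),
        "note": getattr(exc, "note", None),
    }


def try_parse(text):
    """Return the error details for broken YAML, or None if it parses."""
    try:
        yaml.safe_load(text)
    except yaml.YAMLError as exc:
        return error_details(exc)
    return None


details = try_parse("hallo: 1\nbye\n")
for field, value in details.items():
    print(field, "=", value)
```

For the ScannerError raised in the question, `context` is the first argument ("while scanning a simple key"), `context_mark` is `key.mark`, `problem` is the message about the missing ':' and `problem_mark` is the result of `self.get_mark()`.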

If you start with a file txt.yaml:

hallo: 1
bye

and a test.py:

import ruamel.yaml as yaml
txt = open('txt.yaml')
data = yaml.load(txt, yaml.SafeLoader)

then you get this not very descriptive error:

...
ruamel.yaml.scanner.ScannerError: while scanning a simple key
  in "txt.yaml", line 2, column 1
could not find expected ':'
  in "txt.yaml", line 3, column 1

If, however, you change the second line of test.py:

import ruamel.yaml as yaml
txt = open('txt.yaml').read()
data = yaml.load(txt, yaml.SafeLoader)

you get a more interesting error description:

...
ruamel.yaml.scanner.ScannerError: while scanning a simple key
  in "<byte string>", line 2, column 1:
    bye
    ^
could not find expected ':'
  in "<byte string>", line 3, column 1:

    ^

The difference arises because get_mark() (in reader.py) has more context to point to when it is not handling a stream:

def get_mark(self):
    if self.stream is None:
        return Mark(self.name, self.index, self.line, self.column,
                    self.buffer, self.pointer)
    else:
        return Mark(self.name, self.index, self.line, self.column,
                    None, None)

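Marks built this way end up in the context_mark attribute (and, for the error above, in problem_mark as well). Look at them when you want to provide more context for the error. But as shown above, the snippet is only available if you parse the YAML input from a buffer, not from a stream.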
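The buffer-versus-stream difference can be checked directly on the mark. The sketch below assumes plain PyYAML imported as `yaml` and uses `io.StringIO` to stand in for an open file; `context_snippet` is a hypothetical helper name.

```python
import io

import yaml

BROKEN = "hallo: 1\nbye\n"


def context_snippet(source):
    """Return the snippet held by context_mark, or None if there is none."""
    try:
        yaml.safe_load(source)
    except yaml.MarkedYAMLError as exc:
        # get_snippet() returns None when the mark was created without
        # a buffer, which is the case when the reader handles a stream.
        return exc.context_mark.get_snippet()
    return None


from_string = context_snippet(BROKEN)             # parsed from a buffer
from_stream = context_snippet(io.StringIO(BROKEN))  # parsed from a stream

print("from string:", repr(from_string))
print("from stream:", repr(from_stream))
```

Reading the file into a string first, as in the second version of test.py, is therefore what makes the offending line and the caret available.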

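Searching the YAML source is hard work, because all the methods of the various classes end up attached to the Loader or Dumper, of which those classes are parents. The best aid for tracking something down is to grep for def method_name(, since at least the method names are all distinct (they have to be for this scheme to work).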
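As an alternative to grep, the same lookup can be sketched from inside Python by walking the loader's method resolution order. This assumes plain PyYAML imported as `yaml`; `defining_class` is a hypothetical helper, not something the library provides.

```python
import yaml


def defining_class(cls, method_name):
    """Return the first class in cls's MRO that itself defines method_name."""
    for klass in cls.__mro__:
        # vars() only lists names defined on the class itself,
        # not the ones it inherits.
        if method_name in vars(klass):
            return klass
    return None


for name in ("get_mark", "stale_possible_simple_keys"):
    owner = defining_class(yaml.SafeLoader, name)
    print(name, "is defined in", owner.__module__ + "." + owner.__qualname__)
```

The module name printed for each method tells you which source file to open.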


In the above I used my enhanced version of PyYAML, called ruamel.yaml; for the purposes of this answer the two should behave the same.

Based on @Anthon's answer, this code works well:

import sys

try:
    import yaml
except ImportError:
    print('Fatal error:  Yaml library not available')
    sys.exit(1)

# Read the whole file first so the parser works on a buffer, not a stream;
# only then do the marks carry a snippet of the offending line.
with open('y.yml') as f:
    txt = f.read()

try:
    yml = yaml.load(txt, yaml.SafeLoader)
except yaml.YAMLError as exc:
    print("Error while parsing YAML file:")
    if hasattr(exc, 'problem_mark'):
        if exc.context is not None:
            print('  parser says\n' + str(exc.problem_mark) + '\n  ' +
                  str(exc.problem) + ' ' + str(exc.context) +
                  '\nPlease correct data and retry.')
        else:
            print('  parser says\n' + str(exc.problem_mark) + '\n  ' +
                  str(exc.problem) + '\nPlease correct data and retry.')
    else:
        print("Something went wrong while parsing yaml file")
    sys.exit(1)

# make use of `yml`

Example output with slightly corrupted data:

$ yaml_parse.py
Error while parsing YAML file:
  parser says
  in "<unicode string>", line 1525, column 9:
      - name: Curve 1
            ^
  could not found expected ':' while scanning a simple key
Please correct data and retry.

$ yaml_parse.py
Error while parsing YAML file:
  parser says
  in "<unicode string>", line 1526, column 10:
        curve: title 1
             ^
  mapping values are not allowed here
Please correct data and retry.
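A possible variation, sketched under the assumption of plain PyYAML imported as `yaml`: return the message instead of printing it, and add 1 to both `mark.line` and `mark.column`, since both are zero-based. `load_or_explain` is a hypothetical name.

```python
import yaml


def load_or_explain(text):
    """Return (data, None) on success or (None, message) on a YAML error."""
    try:
        return yaml.safe_load(text), None
    except yaml.MarkedYAMLError as exc:
        parts = []
        for label, what, mark in (
            ("context", exc.context, exc.context_mark),
            ("problem", exc.problem, exc.problem_mark),
        ):
            if what is None:
                continue
            where = ""
            if mark is not None:
                # Convert the zero-based mark to the 1-based position
                # a text editor would show.
                where = " (line %d, column %d)" % (mark.line + 1, mark.column + 1)
            parts.append("%s: %s%s" % (label, what, where))
        return None, "; ".join(parts)
    except yaml.YAMLError as exc:
        # Errors without marks only offer their string form.
        return None, str(exc)


data, message = load_or_explain("hallo: 1\nbye\n")
print(message)
```

Returning the message leaves it to the caller to decide whether to print, log or raise.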