How to skip lines when reading a YAML file in Python?

Question

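I'm familiar with similar questions, but they don't seem to address what should be a simple problem. I'm using Python 2.7x and trying to read a YAML file that looks like this: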

%YAML:1.0
radarData: !!opencv-matrix
rows: 5
cols: 2
dt: u
data: [0, 0, 0, 0, 0, 10, 5, 3, 1, 22]

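For now I only need the 'data:' entry. I tried a plain approach first, then tried to force-skip the first 4 lines (the second snippet, commented out). Both approaches raised errors.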

import yaml
stream = file('test_0x.yml', 'r') 
yaml.load(stream)
# alternative code snippet
# with open('test_0x.yml') as f:
#  stream = f.readlines()[4:]
# yaml.load(stream)

Any suggestions on how to skip the first few lines would be greatly appreciated.

Answer 1

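I completely missed the point, but I'm leaving my original answer at the bottom as a humbling reminder.

mhawke's answer is short and sweet, and probably preferable. A more involved solution: strip the malformed directive, correct your custom tag, and add a constructor for it. This has the advantage of correcting the tag wherever it appears in the file, not just in the first few lines.

My implementation here does have some drawbacks: it slurps the entire file, and it hasn't been tested on complex data, where replacing the tag with a proper one might not have the intended effect.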

import yaml

def strip_malformed_directive(yaml_file):
    """
    Strip a malformed YAML directive from the top of a file.

    Returns the slurped (!) file.
    """
    lines = list(yaml_file)
    # Each line keeps its trailing newline, so join with "" rather than "\n".
    if lines and lines[0].startswith('%') and ":" in lines[0]:
        return "".join(lines[1:])
    return "".join(lines)


def convert_opencvmatrix_tag(yaml_events):
    """
    Convert an erroneous custom tag, !!opencv-matrix, to the correct 
    !opencv-matrix, in a stream of YAML events.
    """
    for event in yaml_events:
        if hasattr(event, "tag") and event.tag == u"tag:yaml.org,2002:opencv-matrix":
            event.tag = u"!opencv-matrix"
        yield event


# Placeholder constructor: every !opencv-matrix node loads as None.
yaml.add_constructor("!opencv-matrix", lambda loader, node: None)
with open("test_0x.yml") as yaml_file:
    directive_processed = strip_malformed_directive(yaml_file)
    yaml_events = yaml.parse(directive_processed)
    matrix_tag_converted = convert_opencvmatrix_tag(yaml_events)
    fixed_document = yaml.emit(matrix_tag_converted)

    # Pass the Loader explicitly; recent PyYAML releases require it.
    data = yaml.load(fixed_document, Loader=yaml.Loader)
    print(data)
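
The constructor registered above returns None as a placeholder, so the matrix contents are dropped. Here is a rough sketch of a constructor that keeps them, using plain text replacement instead of the event stream. The names OpenCVLoader, construct_opencv_matrix and fix_opencv_yaml are made up for illustration, and the inline sample assumes the matrix fields are indented under the tagged key, the way OpenCV normally writes them:

```python
import yaml


class OpenCVLoader(yaml.SafeLoader):
    """Loader subclass, so the custom constructor stays off the global loaders."""


def construct_opencv_matrix(loader, node):
    """Turn an !opencv-matrix mapping into a list of rows (nested lists)."""
    mapping = loader.construct_mapping(node, deep=True)
    rows, cols, flat = mapping["rows"], mapping["cols"], mapping["data"]
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


OpenCVLoader.add_constructor(u"!opencv-matrix", construct_opencv_matrix)


def fix_opencv_yaml(text):
    """Drop the malformed %YAML:1.0 line and localize the matrix tag."""
    first_line, _, rest = text.partition("\n")
    if first_line.startswith("%YAML:"):
        text = rest
    # "!!opencv-matrix" expands to a yaml.org tag that has no constructor,
    # so rewrite it as the local tag registered above.
    return text.replace("!!opencv-matrix", "!opencv-matrix")


# Inline sample (assumed layout) so the sketch runs without a file on disk.
sample = u"""%YAML:1.0
radarData: !!opencv-matrix
   rows: 5
   cols: 2
   dt: u
   data: [0, 0, 0, 0, 0, 10, 5, 3, 1, 22]
"""

data = yaml.load(fix_opencv_yaml(sample), Loader=OpenCVLoader)
print(data["radarData"])
```

Text replacement is blunter than rewriting events: it would also touch the string "!!opencv-matrix" if it ever appeared inside a scalar value, which is probably acceptable for calibration files but worth knowing.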

Original answer

The yaml.load function you are using returns a dictionary, which can be accessed like this:

import yaml

with open("test_0x.yml") as yaml_file:
    test_data = yaml.load(yaml_file)

print(test_data["data"])

Does that help?

Answer 2

Actually, you only need to skip the first two lines.

import yaml

skip_lines = 2
with open('test_0x.yml') as infile:
    for i in range(skip_lines):
        _ = infile.readline()
    data = yaml.load(infile)

>>> data
{'dt': 'u', 'rows': 5, 'data': [0, 0, 0, 0, 0, 10, 5, 3, 1, 22], 'cols': 2}
>>> data['data']
[0, 0, 0, 0, 0, 10, 5, 3, 1, 22]

Skipping the first 5 lines also works.
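
As a side note, the commented-out attempt in the question fails for a different reason: readlines() returns a list of strings, while yaml.load expects a string or a file-like object. A minimal sketch of a reusable helper that skips lines and joins the rest back together, assuming PyYAML is installed; the name load_yaml_skipping is made up here, and it uses yaml.safe_load rather than yaml.load since only plain data is left after skipping:

```python
import io
import itertools

import yaml


def load_yaml_skipping(stream, skip_lines):
    """Parse YAML from `stream` after discarding its first `skip_lines` lines."""
    # islice drops the leading lines; the rest is joined into one string
    # because the YAML loader does not accept a list of lines.
    remaining = "".join(itertools.islice(stream, skip_lines, None))
    return yaml.safe_load(remaining)


# Inline copy of the question's file so the sketch runs without test_0x.yml.
sample = u"""%YAML:1.0
radarData: !!opencv-matrix
rows: 5
cols: 2
dt: u
data: [0, 0, 0, 0, 0, 10, 5, 3, 1, 22]
"""

data = load_yaml_skipping(io.StringIO(sample), 2)
print(data["data"])
```

With a real file, the same call would be load_yaml_skipping(open('test_0x.yml'), 2), ideally inside a with block.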

Answer 3

I have a camera matrix generated by aruco_calibration_fromimages.exe; here is the yml file:

%YAML:1.0
---
image_width: 4000
image_height: 3000
camera_matrix: !!opencv-matrix
   rows: 3
   cols: 3
   dt: d
   data: [ 3.1943912478853654e+03, 0., 1.9850941722590378e+03, 0.,
       3.2021356095317910e+03, 1.5509955246019449e+03, 0., 0., 1. ]
distortion_coefficients: !!opencv-matrix
   rows: 1
   cols: 5
   dt: d
   data: [ 1.3952810090687282e-01, -3.8313647492178071e-01,
       5.0555840762660396e-03, 2.3753464602670597e-03,
       3.3952514744179502e-01 ]

Loading this yml with the following code:

import cv2
fs = cv2.FileStorage("./calib_asus_chess/cam_calib_asus.yml", cv2.FILE_STORAGE_READ)
fn = fs.getNode("camera_matrix")
print(fn.mat())

gives this result:

[[  3.19439125e+03   0.00000000e+00   1.98509417e+03]
 [  0.00000000e+00   3.20213561e+03   1.55099552e+03]
 [  0.00000000e+00   0.00000000e+00   1.00000000e+00]]
