Get comments during iteration in ruamel.yaml

How can I get the comments while iterating over a YAML object?

from ruamel.yaml import YAML

yaml = YAML()

with open(path, 'r') as f:  # path: location of the YAML file
    yaml_data = yaml.load(f)

for obj in yaml_data:
    # how to get the comments here?
    pass

This is the source data (an Ansible playbook):

---
- name: gather all complex custom facts using the custom module
  hosts: switches
  gather_facts: False
  connection: local
  tasks:
    # There is a bug in ansible 2.4.1 which prevents it loading
    # playbook/group_vars
    - name: ensure we're running a known working version
      assert:
        that:
          - 'ansible_version.major == 2'
          - 'ansible_version.minor == 4'

After Anthon's comment, I found that this is the way to access the comments of child nodes (it still needs refinement):

from pprint import pprint

for obj in yaml_data:
    # each top-level element is a mapping; its .ca.items maps keys to comment tokens
    pprint(obj.ca.items)

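You did not specify your input, but since your code expects an obj and not a key, I assume the root level of your YAML is a sequence and not a mapping. If you want to get the comments after each element (i.e. nr 1 and the last), you can do: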

import ruamel.yaml

yaml_str = """\
- one  # nr 1
- two 
- three # the last
"""

yaml = ruamel.yaml.YAML()

data = yaml.load(yaml_str)

for idx, obj in enumerate(data):
    comment_token = data.ca.items.get(idx)
    if comment_token is None:
        continue
    print(repr(comment_token[0].value))

which gives:

'# nr 1\n'
'# the last\n'

You might want to strip off the leading octothorpe and the trailing newline.
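A minimal sketch of that cleanup, assuming the raw value looks like the strings printed above. The name clean_comment is a hypothetical helper, not part of ruamel.yaml:

```python
def clean_comment(raw):
    """Strip surrounding whitespace and one leading '#' from a raw comment string."""
    text = raw.strip()
    if text.startswith('#'):
        text = text[1:]
    return text.strip()


print(clean_comment('# nr 1\n'))      # nr 1
print(clean_comment('# the last\n'))  # the last
```

In the loop above it would be applied as clean_comment(comment_token[0].value).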

Note that this works with the current version (0.15.61), but there is no guarantee that it will not change.
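Because the attribute layout is not guaranteed, one option is to read it defensively. The sketch below assumes only that comment data, when present, is a dict reachable as node.ca.items, as in the code above. The name comment_items is hypothetical, and the objects used to exercise it are stand-ins, not real ruamel.yaml nodes:

```python
from types import SimpleNamespace


def comment_items(node):
    """Return node.ca.items if it exists and is a dict, otherwise an empty dict."""
    ca = getattr(node, 'ca', None)
    items = getattr(ca, 'items', None)
    return items if isinstance(items, dict) else {}


# Stand-in objects (not real ruamel.yaml nodes), only to exercise the helper.
with_comments = SimpleNamespace(ca=SimpleNamespace(items={0: ['token']}))
print(comment_items(with_comments))      # {0: ['token']}
print(comment_items(['plain', 'list']))  # {}
```

Code that calls comment_items(data).get(idx) then degrades to "no comment found" instead of raising AttributeError if the attribute is missing.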

Based on other discussion as well as an issue in ruamel.yaml on SourceForge, here is a set of methods that should let you retrieve (almost, see below) all comments in the document:

from ruamel.yaml import YAML
from ruamel.yaml.comments import CommentedMap, CommentedSeq

# set attributes
def get_comments_map(self, key):
    coms = []
    comments = self.ca.items.get(key)
    if comments is None:
        return coms
    for token in comments:
        if token is None:
            continue
        elif isinstance(token, list):
            coms.extend(token)
        else:
            coms.append(token)
    return coms

def get_comments_seq(self, idx):
    coms = []
    comments = self.ca.items.get(idx)
    if comments is None:
        return coms
    for token in comments:
        if token is None:
            continue
        elif isinstance(token, list):
            coms.extend(token)
        else:
            coms.append(token)
    return coms

setattr(CommentedMap, 'get_comments', get_comments_map)
setattr(CommentedSeq, 'get_comments', get_comments_seq)

# load string
yaml_str = """\
- name: gather all complex custom facts using the custom module
  hosts: switches
  gather_facts: False
  connection: local
  tasks:
    # There is a bug in ansible 2.4.1 which prevents it loading
    # playbook/group_vars
    - name: ensure we're running a known working version
      assert:
        that:
          - 'ansible_version.major == 2'
          - 'ansible_version.minor == 4'
"""
yml = YAML(typ='rt')
data = yml.load(yaml_str)

def walk_data(data):
    if isinstance(data, CommentedMap):
        for k, v in data.items():
            print(k, [ comment.value for comment in data.get_comments(k)])
            if isinstance(v, CommentedMap) or isinstance(v, CommentedSeq):
                walk_data(v)
    elif isinstance(data, CommentedSeq):
        for idx, item in enumerate(data):
            print(idx, [ comment.value for comment in data.get_comments(idx)])
            if isinstance(item, CommentedMap) or isinstance(item, CommentedSeq):
                walk_data(item)

walk_data(data)

Here is the output:

0 []
name []
hosts []
gather_facts []
connection []
tasks ['# There is a bug in ansible 2.4.1 which prevents it loading\n', '# playbook/group_vars\n']
0 []
name []
assert []
that []
0 []
1 []
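The two get_comments functions above have identical bodies, so the flattening step could live in one shared helper. This is only a sketch: flatten_tokens is a hypothetical name, and plain strings stand in for the comment tokens so the snippet runs without ruamel.yaml:

```python
def flatten_tokens(entry):
    """Flatten one ca.items entry into a flat list, skipping empty (None) slots."""
    flat = []
    for token in entry or []:
        if token is None:
            continue
        if isinstance(token, list):
            flat.extend(token)
        else:
            flat.append(token)
    return flat


# Plain strings stand in for comment tokens here.
print(flatten_tokens([None, None, None, ['# a\n', '# b\n']]))  # ['# a\n', '# b\n']
print(flatten_tokens(None))                                    # []
```

Both methods could then return flatten_tokens(self.ca.items.get(key)), which keeps the mapping and sequence cases in sync.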

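Unfortunately, there are two issues I have run into that this method does not cover:

  1. You will notice that there is no leading \n in the comments for tasks. As a result, this method cannot distinguish between comments that start on the same line as tasks and comments that start on the next line. Since CommentToken.start_mark.line contains only the absolute line of the comment, it could be compared with the line of tasks. However, I have not yet found a way to retrieve the line associated with tasks in the loaded data.
  2. I have not found a way to retrieve comments at the head of the document through these methods, so any initial comments have to be discovered some other way. Related to issue #1, these head comments are still included in the absolute line count of the other comments. To include the comments at the head of the document, you need to use [comment.value for comment in data.ca.comment[1]].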
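The comparison described in issue 1 can be sketched as follows, assuming the line of the key is already known from somewhere and that both line numbers use the same 0-based counting. comment_position is a hypothetical helper, and a stand-in object replaces the real CommentToken so that only start_mark.line is modelled:

```python
from types import SimpleNamespace


def comment_position(comment_token, key_line):
    """Classify a comment as sharing the key's line or sitting on another line.

    key_line is assumed to be the line of the key, obtained elsewhere.
    """
    line = comment_token.start_mark.line
    return 'same line' if line == key_line else 'separate line'


# Stand-in for a CommentToken; only start_mark.line is modelled.
token = SimpleNamespace(start_mark=SimpleNamespace(line=5))
print(comment_position(token, 4))  # separate line
print(comment_position(token, 5))  # same line
```

Round-trip nodes in ruamel.yaml also carry line and column data in an lc attribute (for example data.lc.key(k)), which may be one source for key_line, although that has not been checked against version 0.15.61 here.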