Accept anchor (alias) in constructor in PyYAML

I need to create a custom constructor for a tag. The tag should accept a list as well as an alias that refers to an anchored list.

For example, this is how I would like to use my tag:

original: &value [1, 2, 3]
processed: !mytag *value

So I created a basic constructor for !mytag that simply returns the input sequence:

import yaml

def my_constructor(loader, node):
    return loader.construct_sequence(node)

yaml.Loader.add_constructor('!mytag', my_constructor)

But when I try to load the YAML source above, I get an error:

>>> source = '''original: &value [1, 2, 3]
processed: !mytag *value'''

>>> yaml.load(source, yaml.Loader)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in t
  File "/usr/local/lib/python3.7/site-packages/yaml/__init__.py", line 114, in load
    return loader.get_single_data()
  File "/usr/local/lib/python3.7/site-packages/yaml/constructor.py", line 41, in get_single_data
    node = self.get_single_node()
  File "/usr/local/lib/python3.7/site-packages/yaml/composer.py", line 36, in get_single_node
    document = self.compose_document()
  File "/usr/local/lib/python3.7/site-packages/yaml/composer.py", line 55, in compose_document
    node = self.compose_node(None, None)
  File "/usr/local/lib/python3.7/site-packages/yaml/composer.py", line 84, in compose_node
    node = self.compose_mapping_node(anchor)
  File "/usr/local/lib/python3.7/site-packages/yaml/composer.py", line 127, in compose_mapping_node
    while not self.check_event(MappingEndEvent):
  File "/usr/local/lib/python3.7/site-packages/yaml/parser.py", line 98, in check_event
    self.current_event = self.state()
  File "/usr/local/lib/python3.7/site-packages/yaml/parser.py", line 439, in parse_block_mapping_key
    "expected <block end>, but found %r" % token.id, token.start_mark)
yaml.parser.ParserError: while parsing a block mapping
  in "test.yml", line 1, column 1
expected <block end>, but found '<alias>'
  in "test.yml", line 2, column 19

Oddly enough, it works if I wrap the alias in square brackets:

>>> source = '''original: &value [1, 2, 3]
processed: !mytag [*value]'''

>>> yaml.load(source, yaml.Loader)
{'original': [1, 2, 3], 'processed': [[1, 2, 3]]}

But that is not what I want: I need the original list to be passed to the constructor, not a list nested inside another list.

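Update: the nested list does not work either. Although the value I return shows up as the original list in the final result, when I try to access it inside the constructor it is only an empty list at that stage: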

>>> source = '''original: &value [1, 2, 3]
... processed: !mytag [*value]'''
>>>
>>> def my_constructor(loader, node):
...     print(loader.construct_sequence(node))
...     return loader.construct_sequence(node)
...
>>> yaml.Loader.add_constructor('!mytag', my_constructor)
>>>
>>> yaml.load(source, yaml.Loader)
[[]]  # <--- this is the printed value
{'original': [1, 2, 3], 'processed': [[1, 2, 3]]}  # <--- this is the returned value

Does anyone know how to do this?

Python 3.7.6 PyYAML 5.3

The additional sequence is exactly what you want.

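Mind that a YAML tag describes the type of a node; it is not a processing instruction. An alias refers to an existing node, and that node already has a type even if it carries no explicit tag (for example, your original sequence is tagged !!seq under the YAML core schema). This is why YAML does not allow a tag on an alias, and why `!mytag *value` is rejected with a parser error.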
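
As a rough illustration, here is a minimal sketch (assuming PyYAML is installed; the key name `alias` and the variable names are made up for this example) that inspects the composed node graph with yaml.compose instead of constructing Python objects. It suggests that the untagged sequence already carries the standard sequence tag, that the alias resolves to the very same node object, and that a tag placed in front of an alias fails at the parsing stage, before any constructor runs:

```python
import yaml

# Compose the node graph without constructing Python objects.
root = yaml.compose('original: &value [1, 2, 3]\nalias: *value')
(_, original_node), (_, alias_node) = root.value

# The untagged sequence was resolved to the standard sequence tag (!!seq).
print(original_node.tag)

# The alias is not a new node; it is the anchored node itself.
print(alias_node is original_node)

# A tag in front of an alias is a syntax error raised while parsing.
try:
    yaml.compose('original: &value [1, 2, 3]\nprocessed: !mytag *value')
    tagged_alias_rejected = False
except yaml.YAMLError:
    tagged_alias_rejected = True
print(tagged_alias_rejected)
```

Since the alias is the same node as the anchored sequence, there is no separate node left for !mytag to attach to.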

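Now, if you want the semantics of your tag to be "take an existing node and transform it in some way", the tag describes a function call. A function call is a structure of its own that merely references its input as a parameter. So, to model this properly, you need to put the parameter inside a structure, and a sequence is the simplest way to do that. You could also do it with a mapping: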

original: &value [1, 2, 3]
processed: !mytag {input: *value}

But that is more verbose.

In your constructor, you then extract the parameter from the surrounding structure and process it.
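
For the mapping form shown above, a minimal sketch of such a constructor might look like the following. The names `SketchLoader` and `mytag_from_mapping` are hypothetical, and the subclass of yaml.SafeLoader is only there to avoid registering the tag globally. It walks the key/value node pairs of the mapping node, picks the value node stored under `input`, and constructs the referenced list from that node directly:

```python
import yaml
from yaml.constructor import ConstructorError

source = '''original: &value [1, 2, 3]
processed: !mytag {input: *value}'''


class SketchLoader(yaml.SafeLoader):
    """Hypothetical loader subclass so the tag is not registered globally."""


def mytag_from_mapping(loader, node):
    # The tagged node is the wrapping mapping, not the aliased list itself.
    if not isinstance(node, yaml.MappingNode):
        raise ConstructorError(None, None,
                               "expected a mapping node", node.start_mark)
    # node.value is a list of (key_node, value_node) pairs.
    for key_node, value_node in node.value:
        if isinstance(key_node, yaml.ScalarNode) and key_node.value == 'input':
            # Build the referenced list from its node right away.
            return loader.construct_sequence(value_node, deep=True)
    raise ConstructorError(None, None,
                           "expected an 'input' key", node.start_mark)


SketchLoader.add_constructor('!mytag', mytag_from_mapping)

result = yaml.load(source, SketchLoader)
print(result)
```

The keyword-style `input` key costs a few extra characters in the YAML, but it leaves room for further named parameters later on.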

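Edit: Here is a proof of concept for accessing the referenced list. I am not sure why you have to navigate into the outer node manually; this might be a PyYAML bug.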

import yaml

source = '''original: &value [1, 2, 3]
processed: !mytag [*value]'''

def my_constructor(loader, node):
    # The tagged node is the wrapping sequence; its first item is the aliased list.
    assert isinstance(node, yaml.SequenceNode)
    param = loader.construct_sequence(node.value[0], deep=True)
    print(param)
    # do something with the param here
    return param

yaml.Loader.add_constructor('!mytag', my_constructor)

print(yaml.load(source, yaml.Loader))

Output:

[1, 2, 3]
{'original': [1, 2, 3], 'processed': [1, 2, 3]}