How to format a string in YAML dump?

Dumping a multi-line string with ruamel.yaml gives the following result:

address_pattern_template: "\n^                           #the beginning of the address\
  \ string (e.g. interface number)\n(?P<junkbefore>             #capturing the junk\
  \ before the address\n    \D?                     #an optional non-digit character\n\
  \    .*?                     #any characters (non-greedy) up to the address\n)\n\
  (?P<address>                #capturing the pure address\n    {pure_address_pattern}\n\
  )\n(?P<junkafter>              #capturing the junk after the address\n    \D? \
  \                    #an optional non-digit character\n    .*                  \
  \    #any characters (greedy) up to the end of the string\n)\n$                \
  \           #the end of the input address string\n"


from ruamel.yaml import YAML

# raw string, so that \D reaches the regex engine unchanged
data = dict(
address_pattern_template=r"""
^                           #the beginning of the address string (e.g. interface number)
(?P<junkbefore>             #capturing the junk before the address
    \D?                     #an optional non-digit character
    .*?                     #any characters (non-greedy) up to the address
)
(?P<address>                #capturing the pure address
    {pure_address_pattern}
)
(?P<junkafter>              #capturing the junk after the address
    \D?                     #an optional non-digit character
    .*                      #any characters (greedy) up to the end of the string
)
$                           #the end of the input address string
"""
)

yaml = YAML(typ='safe', pure=True)
yaml.default_flow_style = False
# raw string, so that the backslash in the Windows path is taken literally
with open(r'D:\datadump.yml', 'w') as dumpfile:
    yaml.dump(data, dumpfile)

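I would like to see the multi-line string in a readable format, i.e. newlines should break the line instead of being shown as "\n". The output I am after looks like this: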


address_pattern_template: |
  ^                           #the beginning of the address string (e.g. interface number)
  (?P<junkbefore>             #capturing the junk before the address
      \D?                     #an optional non-digit character
      .*?                     #any characters (non-greedy) up to the address
  )
  (?P<address>                #capturing the pure address
      {pure_address_pattern}
  )
  (?P<junkafter>              #capturing the junk after the address
      \D?                     #an optional non-digit character
      .*                      #any characters (greedy) up to the end of the string
  )
  $                           #the end of the input address string

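Note that my program dumps a large dict, and such multi-line strings can appear anywhere and at any depth in the dict structure. Traversing the dict tree and loading each of them before dumping (as suggested in "Can I control the formatting of multiline strings?") is therefore not a good solution for me.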


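First of all, what you present as the output you would like to get is not a representation of the data you provide. Since the multi-line string in that data starts with a newline, the block style literal scalar requires a block indentation indicator and a newline at the start: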

address_pattern_template: |2

  ^                           #the beginning of the address string (e.g. interface number)

But (at least to me) it makes no sense for these patterns to start with a newline, so I will leave that out in what follows.

If you don't know where in your data structure the multi-line strings are, but you can convert the data in place before dumping, then you can use ruamel.yaml.scalarstring:walk_tree

import sys
import ruamel.yaml
from ruamel.yaml.scalarstring import walk_tree

# raw string without the leading newline, nested inside a list inside a dict
data = dict(a=[1, 2, 3, dict(
address_pattern_template=r"""^                           #the beginning of the address string (e.g. interface number)
(?P<junkbefore>             #capturing the junk before the address
    \D?                     #an optional non-digit character
    .*?                     #any characters (non-greedy) up to the address
)
(?P<address>                #capturing the pure address
    {pure_address_pattern}
)
(?P<junkafter>              #capturing the junk after the address
    \D?                     #an optional non-digit character
    .*                      #any characters (greedy) up to the end of the string
)
$                           #the end of the input address string
"""
)])

# convert every multi-line string in the tree, in place
walk_tree(data)

yaml = ruamel.yaml.YAML()
yaml.dump(data, sys.stdout)


which gives:

a:
- 1
- 2
- 3
- address_pattern_template: |
    ^                           #the beginning of the address string (e.g. interface number)
    (?P<junkbefore>             #capturing the junk before the address
        \D?                     #an optional non-digit character
        .*?                     #any characters (non-greedy) up to the address
    )
    (?P<address>                #capturing the pure address
        {pure_address_pattern}
    )
    (?P<junkafter>              #capturing the junk after the address
        \D?                     #an optional non-digit character
        .*                      #any characters (greedy) up to the end of the string
    )
    $                           #the end of the input address string

walk_tree replaces the multi-line strings with LiteralScalarString, which for most purposes behaves like a normal string.

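If in-place conversion is not acceptable, you can first make a deepcopy of the data and then apply walk_tree to the copy. If that is not acceptable because of memory constraints, then you have to provide an alternative representer for strings that checks, during representation, whether it is dealing with a multi-line string. Preferably you do this in a subclass of the Representer: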

import sys
import ruamel.yaml

# data defined as before

class MyRepresenter(ruamel.yaml.representer.RoundTripRepresenter):
    def represent_str(self, data):
        # use literal block style only for strings that contain a newline
        style = '|' if '\n' in data else None
        return self.represent_scalar(u'tag:yaml.org,2002:str', data, style=style)

# register the method for plain str objects on this subclass
MyRepresenter.add_representer(str, MyRepresenter.represent_str)

yaml = ruamel.yaml.YAML()
yaml.Representer = MyRepresenter
yaml.dump(data, sys.stdout)
