PyYAML - Using different styles for keys, integers and strings

--- 
"main": 
  "directory": 
    "options": 
      "directive": 'options'
      "item": 
        "options": 'Stuff OtherStuff MoreStuff'
  "directoryindex": 
    "item": 
      "directoryindex": 'stuff.htm otherstuff.htm morestuff.html'
  "fileetag": 
    "item": 
      "fileetag": 'Stuff'
  "keepalive": 
    "item": 
      "keepalive": 'Stuff'
  "keepalivetimeout": 
    "item": 
      "keepalivetimeout": 2

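The above is the YAML file that I need to parse, edit and then dump. I have chosen to use PyYAML on Python 2.7 (which I am required to use). So far I have been able to parse and edit the file.

However, since the YAML uses one style for keys and different styles for strings and integers, I cannot simply set a default style. I am now wondering how to dump different styles for different types using PyYAML.

Below is what I do to parse and edit: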

import yaml

# safe_load is sufficient for plain mappings, strings and integers
with open('yamlfile') as f:
  infile = yaml.safe_load(f)

#Recursive function to loop through nested dictionary
def edit(d,keytoedit=None,newvalue=None):
  for key, value in d.iteritems():
    if isinstance(value, dict) and key == keytoedit and 'item' in value:
      value[value.iterkeys().next()] = {keytoedit:newvalue}
      edit(value,keytoedit=keytoedit,newvalue=newvalue)
    elif isinstance(value, dict) and keytoedit in value and 'item' not in value and key != 'main':
      value[keytoedit] = newvalue
      edit(value,keytoedit=keytoedit,newvalue=newvalue)
    elif isinstance(value, dict):
      edit(value,keytoedit=keytoedit,newvalue=newvalue)

# open() instead of the Python 2-only file(); the with block closes the file
with open('outfile', 'w') as outfile:
  yaml.dump(infile, outfile, default_flow_style=False)

So I would like to know how to achieve this. If I use default_style in yaml.dump, all types get the same style, and I need to adhere to the conventions of the original YAML file.
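To illustrate the problem, here is a minimal sketch of what setting default_style does. It assumes Python 3 and a reasonably recent PyYAML, and the sample dictionary is made up for illustration only.

```python
import yaml

# Hypothetical sample data: one string value and one integer value.
sample = {'keepalive': 'Stuff', 'keepalivetimeout': 2}

# default_style is applied to every scalar: keys, strings and integers alike.
uniform = yaml.dump(sample, default_style='"', default_flow_style=False)
print(uniform)
```

Every scalar ends up double-quoted, and the integer is written with an explicit tag (`!!int "2"`) so that it still loads as an integer, which is not what the original file looks like.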

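Can I somehow specify a style for specific types with PyYAML?

Edit: This is what I have got so far; the missing part is the double quotes on the keys and the single quotes on the strings.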

main:
  directory:
    options:
      directive: options
      item:
        options: Stuff OtherStuff MoreStuff
  directoryindex:
    item:
      directoryindex: stuff.html otherstuff.htm morestuff.html
  fileetag:
    item:
      fileetag: Stuff
  keepalive:
    item:
      keepalive: 'On'
  keepalivetimeout:
    item:
      keepalivetimeout: 2

You can at least preserve the original flow/block style for the various elements with the normal yaml.dump(), for some value of "normal".

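What you need is a loader that saves the flow/block style information while reading the data, and subclasses of the normal types that carry a style (mappings/dicts and sequences/lists), so that they behave like the Python constructs the loader normally returns but have the style information attached. Then, on the way out through yaml.dump, you provide a custom dumper that takes this style information into account.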
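As a rough illustration of that idea using only plain PyYAML, the sketch below attaches style information through str subclasses and lets a custom dumper honour it. This is not how ruamel.yaml is implemented: the names DoubleQuoted, SingleQuoted, StyledDumper and annotate are hypothetical, the style is attached after loading rather than by the loader, and the code assumes Python 3.

```python
import yaml


class DoubleQuoted(str):
    """String that should be dumped with double quotes."""


class SingleQuoted(str):
    """String that should be dumped with single quotes."""


def represent_double_quoted(dumper, data):
    return dumper.represent_scalar('tag:yaml.org,2002:str', str(data), style='"')


def represent_single_quoted(dumper, data):
    return dumper.represent_scalar('tag:yaml.org,2002:str', str(data), style="'")


class StyledDumper(yaml.SafeDumper):
    """Separate dumper class so the global SafeDumper is left untouched."""


StyledDumper.add_representer(DoubleQuoted, represent_double_quoted)
StyledDumper.add_representer(SingleQuoted, represent_single_quoted)


def annotate(node):
    """Wrap keys and string values in styled types; leave other scalars alone."""
    if isinstance(node, dict):
        return {DoubleQuoted(k): annotate(v) for k, v in node.items()}
    if isinstance(node, list):
        return [annotate(v) for v in node]
    if isinstance(node, str):
        return SingleQuoted(node)
    return node


data = {'main': {'keepalive': {'item': {'keepalive': 'Stuff'}},
                 'keepalivetimeout': {'item': {'keepalivetimeout': 2}}}}
text = yaml.dump(annotate(data), Dumper=StyledDumper, default_flow_style=False)
print(text)
```

Because the styled types are still str subclasses, the rest of the program can keep treating the data as ordinary dictionaries and strings; only the dumper looks at the extra type information.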

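I am using the normal yaml.dump in my enhanced version of PyYAML, called ruamel.yaml, but with the special dumper class RoundTripDumper (and RoundTripLoader for yaml.load), which preserve the flow/block style (and any comments you might have in the file):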

import ruamel.yaml as yaml

infile = yaml.load(open('yamlfile'), Loader=yaml.RoundTripLoader)

for key, value in infile['main'].items():
    if key == 'keepalivetimeout':
        item = value['item']
        item['keepalivetimeout'] = 400

print yaml.dump(infile, Dumper=yaml.RoundTripDumper)

gives you:

main:
  directory:
    options:
      directive: options
      item:
        options: Stuff OtherStuff MoreStuff
  directoryindex:
    item:
      directoryindex: stuff.htm otherstuff.htm morestuff.html
  fileetag:
    item:
      fileetag: Stuff
  keepalive:
    item:
      keepalive: Stuff
  keepalivetimeout:
    item:
      keepalivetimeout: 400

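If you cannot install ruamel.yaml, you can pull the code from my repository and include it in your own code; AFAIK PyYAML has not been upgraded since I started working on this.

I currently do not preserve superfluous quotes on scalars, but I do preserve the chomping information (for multi-line scalars starting with '|'). The quoting information is thrown away very early in the input processing of the YAML file, and preserving it would require changes in multiple places.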
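A small sketch with plain PyYAML shows where the quoting information lives and where it gets lost. It assumes Python 3 and that PyYAML behaves like ruamel.yaml in this respect; the two-line input is made up for illustration.

```python
import yaml

text = "\"key\": 'value'\ncount: 2\n"

# At the event level each scalar still knows how it was quoted
# (None means a plain, unquoted scalar).
styles = [(event.value, event.style)
          for event in yaml.parse(text)
          if isinstance(event, yaml.ScalarEvent)]
print(styles)

# After construction only plain Python objects remain; the styles are gone.
loaded = yaml.safe_load(text)
print(loaded)
```

Once the events have been turned into ordinary str and int objects there is nowhere left to store the style, which is why preserving it needs dedicated scalar types.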

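Since you seem to use different quotes for key and value string scalars, you can get the output you want by overriding process_scalar (part of the Emitter in emitter.py) so that it adds quotes depending on whether the scalar is a key and whether it is an integer: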

import ruamel.yaml as yaml

# the scalar emitter from emitter.py
def process_scalar(self):
    if self.analysis is None:
        self.analysis = self.analyze_scalar(self.event.value)
    if self.style is None:
        self.style = self.choose_scalar_style()
    split = (not self.simple_key_context)
    # VVVVVVVVVVVVVVVVVVVV added
    try:
        int(self.event.value)  # might need to expand this
    except ValueError:
        # not an integer, so treat it as a string
        if split:
            self.style = "'"
        else:
            self.style = '"'
    # ^^^^^^^^^^^^^^^^^^^^
    # if self.analysis.multiline and split    \
    #         and (not self.style or self.style in '\'\"'):
    #     self.write_indent()
    if self.style == '"':
        self.write_double_quoted(self.analysis.scalar, split)
    elif self.style == '\'':
        self.write_single_quoted(self.analysis.scalar, split)
    elif self.style == '>':
        self.write_folded(self.analysis.scalar)
    elif self.style == '|':
        self.write_literal(self.analysis.scalar)
    else:
        self.write_plain(self.analysis.scalar, split)
    self.analysis = None
    self.style = None
    if self.event.comment:
        self.write_post_comment(self.event)


infile = yaml.load(open('yamlfile'), Loader=yaml.RoundTripLoader)

for key, value in infile['main'].items():
    if key == 'keepalivetimeout':
        item = value['item']
        item['keepalivetimeout'] = 400

dd = yaml.RoundTripDumper
dd.process_scalar = process_scalar

print '---'
print yaml.dump(infile, Dumper=dd)

gives you:

---
"main":
  "directory":
    "options":
      "directive": 'options'
      "item":
        "options": 'Stuff OtherStuff MoreStuff'
  "directoryindex":
    "item":
      "directoryindex": 'stuff.htm otherstuff.htm morestuff.html'
  "fileetag":
    "item":
      "fileetag": 'Stuff'
  "keepalive":
    "item":
      "keepalive": 'Stuff'
  "keepalivetimeout":
    "item":
      "keepalivetimeout": 400

This is very close to what you asked for.
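If you have to stay on plain PyYAML, a similar effect can probably be had by subclassing a dumper and overriding choose_scalar_style instead of replacing process_scalar. The following is only a sketch under assumptions: the class name KeyValueQuotingDumper is hypothetical, it relies on emitter internals (self.event and self.simple_key_context) that are not a documented public API, and it is written for Python 3.

```python
import yaml


class KeyValueQuotingDumper(yaml.SafeDumper):
    """Double-quote string keys, single-quote string values, keep the rest."""

    def choose_scalar_style(self):
        style = super().choose_scalar_style()
        is_string = self.event.tag == 'tag:yaml.org,2002:str'
        # Only override scalars that would be plain or single-quoted;
        # anything that needs double quotes or a block style is left alone.
        if is_string and style in ('', "'"):
            return '"' if self.simple_key_context else "'"
        return style


data = {'main': {'keepalive': {'item': {'keepalive': 'Stuff'}},
                 'keepalivetimeout': {'item': {'keepalivetimeout': 400}}}}
text = yaml.dump(data, Dumper=KeyValueQuotingDumper,
                 default_flow_style=False, explicit_start=True)
print(text)
```

Checking the node tag rather than calling int() on the value means that floats, booleans and nulls are also left unquoted, which may or may not be what you want.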

Use ruamel.yaml instead.

Its documentation is better than PyYAML's: https://pypi.org/project/ruamel.yaml/

An example of the template.yaml file I want to read:

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: >
  lambda_explicit_matchning

  Sample SAM Template for lambda_explicit_matchning

# More info about Globals: https://github.com/awslabs/serverless-application-model/blob/master/docs/globals.rst
Globals:
  Function:
    Timeout: 900

Resources:
  ExplicitAlgoFunction:
    Type: AWS::Serverless::Function # More info about Function Resource: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#awsserverlessfunction

    Properties:
      MemorySize: 3008

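As you can see in my example, some strings are quoted (such as '2010-09-09') while the integers are not.

Loading and parsing that YAML file is then this simple (no need to worry about styles):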

    from ruamel.yaml import YAML

    yaml = YAML()  # round-trip mode is the default
    # the with block closes the file; load() accepts an open stream
    with open("template.yaml", "r") as f:
        sam_yaml = yaml.load(f)

The ruamel library can read the YAML file without you having to worry about styles. It is that simple :D