In YAML, must a quoted scalar be interpreted by a parser as a string?

Question

I have seen advice on the Internet that if you want a YAML scalar value to be treated as a string, you should quote it:

foo : "2018-04-17"

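In the example above, this advice purports to tell me that the value 2018-04-17 will be treated by any given YAML parser as the string type of its host language. For example, if the advice were true, SnakeYAML would interpret it as a java.lang.String rather than a java.util.Date. (As it happens, SnakeYAML interprets it as a java.util.Date, quoted or not, which is why I am asking this question.)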

Although this advice may happen to work with any given parser, I cannot see where in the YAML 1.2 specification it might come from. The closest thing I can find is the following sentence:

YAML allows scalars to be presented in several formats. For example, the integer “11” might also be written as “0xB”. Tags must specify a mechanism for converting the formatted content to a canonical form for use in equality testing. Like node style, the format is a presentation detail and is not reflected in the serialization tree and representation graph.

And this one:

The scalar style is a presentation detail and must not be used to convey content information, with the exception that plain scalars are distinguished for the purpose of tag resolution.

And this one:

Note that resolution must not consider presentation details such as comments, indentation and node style.

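Nevertheless, I see many YAML documents that rely on the advice that double-quoting a value means it will be parsed as a string, which makes me think I have misread something. Is there any debate on this topic?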

Answer 1

Here is the relevant part of the YAML 1.1 specification (note that SnakeYaml implements YAML 1.1, so the 1.2 specification does not necessarily apply):

It is not required that all the tags of the complete representation be explicitly specified in the character stream. During parsing, nodes that omit the tag are given a non-specific tag: “?” for plain scalars and “!” for all other nodes. [...]

It is recommended that nodes having the “!” non-specific tag should be resolved as “tag:yaml.org,2002:seq”, “tag:yaml.org,2002:map” or “tag:yaml.org,2002:str” depending on the node’s kind. This convention allows the author of a YAML character stream to exert some measure of control over the tag resolution process. By explicitly specifying a plain scalar has the “!” non-specific tag, the node is resolved as a string, as if it was quoted or written in a block style. Note, however, that each application may override this behavior. For example, an application may automatically detect the type of programming language used in source code presented as a non-plain scalar and resolve it accordingly.
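The convention quoted above can be sketched as a toy model. This is only an illustration under stated assumptions: the class NonSpecificTagSketch, its method names, and the date-only pattern standing in for implicit timestamp detection are hypothetical, and none of this is SnakeYAML's actual implementation.

```java
import java.util.regex.Pattern;

// Toy model of the YAML 1.1 convention quoted above; not SnakeYAML's code.
class NonSpecificTagSketch {
    enum Kind { SCALAR, SEQUENCE, MAPPING }

    // Simplified, hypothetical stand-in for an implicit timestamp rule (date-only form).
    static final Pattern DATE = Pattern.compile("\\d{4}-\\d{2}-\\d{2}");

    // YAML 1.1: "?" for plain scalars, "!" for all other untagged nodes.
    static String nonSpecificTag(Kind kind, boolean plain) {
        return (kind == Kind.SCALAR && plain) ? "?" : "!";
    }

    // Recommended (not required) resolution of an untagged node.
    static String resolve(Kind kind, boolean plain, String content) {
        if (nonSpecificTag(kind, plain).equals("!")) {
            switch (kind) {
                case SEQUENCE: return "tag:yaml.org,2002:seq";
                case MAPPING:  return "tag:yaml.org,2002:map";
                default:       return "tag:yaml.org,2002:str";
            }
        }
        // "?" scalars: the application decides, here by content only.
        return DATE.matcher(content).matches()
                ? "tag:yaml.org,2002:timestamp"
                : "tag:yaml.org,2002:str";
    }

    public static void main(String[] args) {
        // Plain scalar: resolved by content.
        System.out.println(resolve(Kind.SCALAR, true, "2018-04-17"));
        // Quoted scalar: resolved as a string under the recommendation.
        System.out.println(resolve(Kind.SCALAR, false, "2018-04-17"));
    }
}
```

In this model the quoted form resolves to a string only because the sketch follows the recommendation; an application that overrides the convention, as the specification allows, would behave differently.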

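To sum up, a YAML processor is not required to resolve quoted scalars as strings, and YAML does not specify which native type tag:yaml.org,2002:str maps to either. In fact, most YAML implementations follow only part of this recommendation. For example, if you use SnakeYaml to deserialize YAML into a POJO/JavaBean, you usually do not put any explicit tags in your YAML, yet your mappings are resolved to the Java classes that correspond to the root class's structure, rather than to generic Maps as the recommendation suggests (since all mappings without an explicit tag get the ! non-specific tag).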

Note that the assignment of non-specific tags was changed in YAML 1.2:

During parsing, nodes lacking an explicit tag are given a non-specific tag: “!” for non-plain scalars, and “?” for all other nodes.
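The difference between the two quoted rules is easiest to see side by side. The following is a minimal sketch that encodes only the two sentences quoted from the specifications; the class and method names are hypothetical and chosen for illustration.

```java
// Side-by-side encoding of the two quoted rules for untagged nodes.
class NonSpecificTagByVersion {
    enum Kind { SCALAR, SEQUENCE, MAPPING }

    // YAML 1.1: "?" for plain scalars, "!" for all other nodes.
    static String yaml11(Kind kind, boolean plainScalar) {
        return (kind == Kind.SCALAR && plainScalar) ? "?" : "!";
    }

    // YAML 1.2: "!" for non-plain scalars, "?" for all other nodes.
    static String yaml12(Kind kind, boolean plainScalar) {
        return (kind == Kind.SCALAR && !plainScalar) ? "!" : "?";
    }

    public static void main(String[] args) {
        // Scalars get the same non-specific tag in both versions...
        System.out.println("quoted scalar: " + yaml11(Kind.SCALAR, false) + " / " + yaml12(Kind.SCALAR, false));
        System.out.println("plain scalar:  " + yaml11(Kind.SCALAR, true) + " / " + yaml12(Kind.SCALAR, true));
        // ...but untagged collections differ: "!" in 1.1, "?" in 1.2.
        System.out.println("mapping:       " + yaml11(Kind.MAPPING, false) + " / " + yaml12(Kind.MAPPING, false));
    }
}
```

Under the 1.2 rule an untagged mapping carries the "?" tag, so resolving it to an application-defined class no longer contradicts the recommendation for "!" nodes.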

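The YAML 1.2 rule is closer to what most implementations do. Still, if you deserialize into a class such as class Foo { String bar; }, the following will load even though the quoted "bar" is treated as a field name rather than as a string: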

"bar": some value

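So the advice for using YAML is to specify the required structure on the application side. In SnakeYaml, you set the root class type, and each value is then mapped to the type required at its position in the hierarchy, as long as it can be mapped there, regardless of whether it is quoted. Usually it makes more sense for the application to specify which kind of value it expects throughout the hierarchy than for the YAML author to specify it through quoting. This is also in line with the YAML specification, which says: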

Resolving the tag of a node must only depend on the following three parameters: (1) the non-specific tag of the node, (2) the path leading from the root to the node, and (3) the content (and hence the kind) of the node.

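Resolving the tag is the YAML term for determining the target type, and it is permissible to determine the target type from the node's position in the hierarchy. The root type is determined by the fact that the element is the root of the YAML document; in SnakeYaml's case, it can be set via the API. All other types are determined by the fact that they are descendants of the root type.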
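As a rough sketch of what resolving by position can look like, the toy code below picks the target type from the path alone. Everything in it is an assumption made for illustration: the class PathBasedResolutionSketch, the EXPECTED map, and the paths /foo and /released are hypothetical, and a real library such as SnakeYaml derives this information from the root class rather than from a hand-written map.

```java
import java.time.LocalDate;
import java.util.Map;

// Toy model: the application, not the scalar style, decides the target type.
class PathBasedResolutionSketch {
    // Hypothetical application-side schema: path from the root -> expected Java type.
    static final Map<String, Class<?>> EXPECTED = Map.of(
            "/foo", String.class,
            "/released", LocalDate.class);

    // Converts scalar content using only the path and the content.
    // Whether the scalar was quoted is deliberately not a parameter.
    static Object convert(String path, String content) {
        Class<?> target = EXPECTED.getOrDefault(path, String.class);
        if (target == LocalDate.class) {
            return LocalDate.parse(content); // ISO-8601 date, e.g. 2018-04-17
        }
        return content;
    }

    public static void main(String[] args) {
        // Same content, different paths, different target types.
        System.out.println(convert("/foo", "2018-04-17").getClass().getName());
        System.out.println(convert("/released", "2018-04-17").getClass().getName());
    }
}
```

The same text 2018-04-17 ends up as a String at one path and as a date at another, which is the behavior the specification permits when resolution depends on the path and the content.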

A final note: if you really want something to be a string, !!str 2018-04-17 will do, because it sets a specific tag on the node.

yaml

snakeyaml