In YAML, must a quoted scalar be interpreted by a parser as a string?
I've seen advice on the Internet that if you want a YAML scalar value to be treated as a string, you should quote it:
foo : "2018-04-17"
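In the example above, this advice is meant to tell me that the value 2018-04-17 will be treated by any given YAML parser as its native language's string type. For example, if this advice were true, SnakeYAML would interpret it as a java.lang.String rather than a java.util.Date. (As it happens, SnakeYAML interprets it as a java.util.Date, quoted or not, which is why I'm asking this question.)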
But although this advice may happen to work with any given parser, I can't see where in the YAML 1.2 specification it might come from. The closest thing I can find is the following sentence:
YAML allows scalars to be presented in several formats. For example, the integer “11” might also be written as “0xB”. Tags must specify a mechanism for converting the formatted content to a canonical form for use in equality testing. Like node style, the format is a presentation detail and is not reflected in the serialization tree and representation graph.
And this one:
The scalar style is a presentation detail and must not be used to convey content information, with the exception that plain scalars are distinguished for the purpose of tag resolution.
And this one:
Note that resolution must not consider presentation details such as comments, indentation and node style.
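Nonetheless, I see a lot of YAML documentation relying on the advice that double-quoting a value means it will be parsed as a string, which makes me think I've misread something. Is there any debate on this topic?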
Here is the relevant part of the YAML 1.1 specification (note that SnakeYaml implements YAML 1.1, so the 1.2 specification does not necessarily apply):
It is not required that all the tags of the complete representation be explicitly specified in the character stream. During parsing, nodes that omit the tag are given a non-specific tag: “?” for plain scalars and “!” for all other nodes. [...]
It is recommended that nodes having the “!” non-specific tag should be resolved as “tag:yaml.org,2002:seq”, “tag:yaml.org,2002:map” or “tag:yaml.org,2002:str” depending on the node’s kind. This convention allows the author of a YAML character stream to exert some measure of control over the tag resolution process. By explicitly specifying a plain scalar has the “!” non-specific tag, the node is resolved as a string, as if it was quoted or written in a block style. Note, however, that each application may override this behavior. For example, an application may automatically detect the type of programming language used in source code presented as a non-plain scalar and resolve it accordingly.
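The recommendation quoted above can be written down as a small function. This is only a sketch: the class and method names are hypothetical and are not part of SnakeYAML or any other library. It shows that the spec suggests a tag only for nodes carrying the “!” non-specific tag, and that an application remains free to do something else.

```java
import java.util.Optional;

// Hypothetical sketch of the YAML 1.1 recommendation; not a library API.
class RecommendedTagSketch {
    enum Kind { SCALAR, SEQUENCE, MAPPING }

    // Returns the tag the YAML 1.1 spec *recommends* for a node with the "!"
    // non-specific tag. For "?" nodes the recommendation quoted above says
    // nothing, so an empty Optional is returned.
    static Optional<String> recommendedTag(String nonSpecificTag, Kind kind) {
        if (!"!".equals(nonSpecificTag)) {
            return Optional.empty();
        }
        switch (kind) {
            case SEQUENCE: return Optional.of("tag:yaml.org,2002:seq");
            case MAPPING:  return Optional.of("tag:yaml.org,2002:map");
            default:       return Optional.of("tag:yaml.org,2002:str");
        }
    }

    public static void main(String[] args) {
        // foo: "2018-04-17" is a quoted scalar, so it carries "!"
        System.out.println(recommendedTag("!", Kind.SCALAR));
        // foo: 2018-04-17 is a plain scalar, so it carries "?"
        System.out.println(recommendedTag("?", Kind.SCALAR));
    }
}
```

Because this is a recommendation and not a requirement, a conforming processor may return something different for the quoted case.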
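So to sum up, a YAML processor is not required to resolve a quoted scalar as a string, and YAML does not define which native type tag:yaml.org,2002:str is mapped to either. In fact, most YAML implementations follow only part of this recommendation. For example, if you use SnakeYaml to deserialize YAML into POJOs/JavaBeans, you will typically not use any explicit tags in your YAML, yet your mappings are resolved to Java classes that correspond to the root class's structure, not to a generic Map as this recommendation would suggest (since all mappings without an explicit tag get the ! non-specific tag).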
Note that this has changed in YAML 1.2:
During parsing, nodes lacking an explicit tag are given a non-specific tag: “!” for non-plain scalars, and “?” for all other nodes.
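To make the difference between the two spec versions concrete, here is a minimal sketch that encodes both quoted rules side by side. The names are hypothetical and exist only for illustration.

```java
// Hypothetical sketch comparing the two quoted rules; not a library API.
class NonSpecificTagSketch {
    enum Kind { SCALAR, SEQUENCE, MAPPING }

    // YAML 1.1 wording: "?" for plain scalars, "!" for all other nodes.
    static String yaml11(Kind kind, boolean plain) {
        return (kind == Kind.SCALAR && plain) ? "?" : "!";
    }

    // YAML 1.2 wording: "!" for non-plain scalars, "?" for all other nodes.
    static String yaml12(Kind kind, boolean plain) {
        return (kind == Kind.SCALAR && !plain) ? "!" : "?";
    }

    public static void main(String[] args) {
        // Only scalars can be plain, so collections are passed plain = false.
        for (Kind kind : Kind.values()) {
            System.out.println(kind + ": 1.1 gives " + yaml11(kind, false)
                    + ", 1.2 gives " + yaml12(kind, false));
        }
    }
}
```

Scalars get the same non-specific tag under both rules. Only untagged sequences and mappings change, from “!” to “?”, which is what frees a processor to resolve a mapping to something other than a generic map.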
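The YAML 1.2 rule is closer to what most implementations do. Still, if you deserialize into a class such as class Foo { String bar; }, the following would load even though the quoted "bar" does not end up as a string but names a field: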
"bar": some value
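So the advice for using YAML is to specify the desired structure on the application side. In SnakeYaml, you set the root class type, and each value is then mapped to the place in the type hierarchy where it is expected, as long as it can be mapped there, regardless of whether it is quoted. Usually it makes more sense for the application to specify which kind of value it expects at each place in the hierarchy than for the YAML author to specify that by quoting. This also conforms to the YAML spec, which says: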
Resolving the tag of a node must only depend on the following three parameters: (1) the non-specific tag of the node, (2) the path leading from the root to the node, and (3) the content (and hence the kind) of the node.
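Resolving the tag is the YAML term for determining the target type. Determining the target type from a node's position in the hierarchy is allowed: the root type is determined by the fact that the element is the root of the YAML document (in SnakeYaml's case, it can be supplied through the API), and all other types are determined by the fact that they are descendants of the root type.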
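As an illustration of resolving by position, the sketch below picks the target type from a root class by field name and ignores the non-specific tag entirely. It is a simplified stand-in, not SnakeYAML's implementation: the class names, the resolve method, the use of java.time.LocalDate, and the two supported field types are all assumptions made for the example.

```java
import java.time.LocalDate;

// Hypothetical sketch of position-based resolution; not SnakeYAML's code.
class RootTypeSketch {
    // The application-side root type; field names double as mapping keys.
    static class Foo {
        LocalDate foo;
        String bar;
    }

    // Converts scalar content to the type declared by the field named `key`.
    // The non-specific tag ("!" for quoted, "?" for plain) is accepted but
    // deliberately ignored: only the path and the content are used.
    static Object resolve(Class<?> rootType, String key, String nonSpecificTag, String content) {
        Class<?> target;
        try {
            target = rootType.getDeclaredField(key).getType();
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("no field for key: " + key, e);
        }
        if (target == LocalDate.class) {
            return LocalDate.parse(content);
        }
        if (target == String.class) {
            return content;
        }
        throw new IllegalArgumentException("unsupported target type: " + target);
    }

    public static void main(String[] args) {
        Object quoted = resolve(Foo.class, "foo", "!", "2018-04-17"); // foo: "2018-04-17"
        Object plain = resolve(Foo.class, "foo", "?", "2018-04-17");  // foo: 2018-04-17
        // Both values end up as the same date type, quoted or not.
        System.out.println(quoted.getClass().getSimpleName() + " " + plain.getClass().getSimpleName());
    }
}
```

This mirrors the behaviour described in the question: once the application states that the field is a date, quoting the value in the document no longer decides the outcome.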
A final note: if you really want something to be a string, !!str 2018-04-17 will do, since it sets a specific tag on the node.
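For reference, !!str is shorthand for the full tag tag:yaml.org,2002:str mentioned earlier. A minimal sketch of that expansion follows, assuming the document contains no %TAG directive that redefines the !! handle. The class and method names are hypothetical.

```java
// Hypothetical sketch of tag shorthand expansion; not a library API.
class TagShorthandSketch {
    // Default prefix of the secondary tag handle "!!"
    // (assumes no %TAG directive overrides it).
    static final String SECONDARY_PREFIX = "tag:yaml.org,2002:";

    // Expands "!!suffix" to its full tag; any other input is returned unchanged.
    static String expand(String tag) {
        return tag.startsWith("!!") ? SECONDARY_PREFIX + tag.substring(2) : tag;
    }

    public static void main(String[] args) {
        System.out.println(expand("!!str")); // prints tag:yaml.org,2002:str
    }
}
```

Unlike quoting, which only yields the “!” non-specific tag, this gives the node a specific tag, so there is nothing left for the processor to resolve.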