XML allowable characters
At a high level, what do the following character codes add support for in XML?
[#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] |
[#x37F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] |
[#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
Reference: https://www.w3.org/TR/xml/#NT-NameStartChar
I can look up individual characters, for example:
À latin capital letter a with grave 0300 192 0xC0 À
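But I would like to know whether someone can explain, at a high level, what this allows and what it disallows, since there are gaps between the ranges (for example, 0xF7).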
The rationale behind the naming rules is summarized on the same linked page.
The first character of a Name must be a NameStartChar, and any other characters must be NameChars; this mechanism is used to prevent names from beginning with European (ASCII) digits or with basic combining characters.
Almost all characters are permitted in names, except those which either are or reasonably could be used as delimiters.
The ASCII symbols and punctuation marks, along with a fairly large group of Unicode symbol characters, are excluded from names because they are more useful as delimiters in contexts where XML names are used outside XML documents
For example, checking the list of Unicode blocks finds that x300-x36F are Combining Diacritical Marks, and x2190-x21FF are Arrows, which explains why both ranges are excluded from the quoted list.
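To make the gaps concrete, here is a minimal Python sketch that tests a few code points against the ranges quoted in the question and prints each one's Unicode name and category. The table covers only the non-ASCII ranges listed above (the full NameStartChar production also allows ":", "_", and the ASCII letters), and the helper names `in_quoted_ranges` and `describe` are made up for this illustration.

```python
import unicodedata

# Non-ASCII NameStartChar ranges, copied from the list quoted in the question.
NAME_START_RANGES = [
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF), (0x370, 0x37D),
    (0x37F, 0x1FFF), (0x200C, 0x200D), (0x2070, 0x218F), (0x2C00, 0x2FEF),
    (0x3001, 0xD7FF), (0xF900, 0xFDCF), (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]


def in_quoted_ranges(cp: int) -> bool:
    """Return True if the code point falls inside one of the quoted ranges."""
    return any(lo <= cp <= hi for lo, hi in NAME_START_RANGES)


def describe(cp: int) -> str:
    """Summarize a code point: name, general category, and range membership."""
    ch = chr(cp)
    name = unicodedata.name(ch, "<unnamed>")
    verdict = "in ranges" if in_quoted_ranges(cp) else "in a gap"
    return f"U+{cp:04X} {name} [{unicodedata.category(ch)}]: {verdict}"


# One allowed letter, followed by code points that sit in the gaps.
for cp in (0xC0, 0xD7, 0xF7, 0x300, 0x37E, 0x2190):
    print(describe(cp))
```

The gap at 0xF7 turns out to be DIVISION SIGN, and the one at 0xD7 is MULTIPLICATION SIGN: both are math symbols sitting in the middle of the Latin-1 letters, which fits the "could be used as delimiters" rationale quoted above.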
More specifically, the section on Character Classes describes the name rules in terms of Unicode Categories (with some exceptions and clarifications noted separately and not reproduced below).
Name start characters must have one of the categories Ll, Lu, Lo, Lt, Nl.
- Ll - Letter, lowercase
- Lu - Letter, uppercase
- Lo - Letter, other (an ideograph or a letter in a unicase alphabet)
- Lt - Letter, titlecase (ligatures containing uppercase followed by lowercase)
- Nl - Number, letter (numerals composed of letters or letterlike symbols)
Name characters other than Name-start characters must have one of the categories Mc, Me, Mn, Lm, or Nd.
- Mc - Mark, spacing combining
- Me - Mark, enclosing
- Mn - Mark, nonspacing
- Lm - Letter, modifier (incl. diacritics)
- Nd - Number, decimal digit
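The two category lists above can be approximated with Python's `unicodedata` module. This is only a rough sketch: it ignores the exceptions the spec notes separately (for instance, "_" and ":" are allowed in names even though they are punctuation), and Python ships a much newer Unicode database than the one the spec was written against, so individual characters may be classified differently. The function names are invented for the example.

```python
import unicodedata

# General categories allowed for the first character of a name.
NAME_START_CATEGORIES = {"Ll", "Lu", "Lo", "Lt", "Nl"}
# Additional categories allowed only after the first character.
NAME_EXTRA_CATEGORIES = {"Mc", "Me", "Mn", "Lm", "Nd"}


def category_allows_start(ch: str) -> bool:
    """True if the character's general category is a name-start category."""
    return unicodedata.category(ch) in NAME_START_CATEGORIES


def category_allows_inside(ch: str) -> bool:
    """True if the category is allowed anywhere after the first character."""
    allowed = NAME_START_CATEGORIES | NAME_EXTRA_CATEGORIES
    return unicodedata.category(ch) in allowed


# A letter, a math symbol, a combining mark, a digit, and an arrow.
for ch in ("\u00C0", "\u00F7", "\u0300", "5", "\u2190"):
    print(
        f"U+{ord(ch):04X} {unicodedata.category(ch)}: "
        f"start={category_allows_start(ch)} inside={category_allows_inside(ch)}"
    )
```

Digits and combining marks come out as valid inside a name but not at the start, while symbols such as the division sign and the arrows are rejected in both positions.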