Unicode and HTML entities

I have a site served with charset=utf-8, running Bootstrap 3.7 with an HTML5 doctype.

If someone inserts this:

<div>&#139;text&#155;</div>

It renders the wrong way, like this:

?text?

If we use either of these instead, it works:

<div>&#8249;text&#8250;</div>

<div>&lsaquo;text&rsaquo;</div>

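This gives us: ‹text›. But I would like to know why the first set of Unicode character references does not work.

As defined in this W3C Recommendation (which HTML also supports):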

Character and Entity References

[Definition: A character reference refers to a specific character in the ISO/IEC 10646 character set, for example one not directly accessible from available input devices.]

Character Reference

CharRef ::= '&#' [0-9]+ ';' | '&#x' [0-9a-fA-F]+ ';'

Well-formedness constraint: Legal Character. Characters referred to using character references must match the production for Char (any Unicode character, excluding the surrogate blocks, FFFE, and FFFF.).

If the character reference begins with "&#x", the digits and letters up to the terminating ; provide a hexadecimal representation of the character's code point in ISO/IEC 10646.
If it begins just with "&#", the digits up to the terminating ; provide a decimal representation of the character's code point.
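
To make the quoted rule concrete, here is a minimal Python sketch that reads a numeric character reference and returns the code point it names. The function name `char_ref_code_point` is made up for illustration, and the sketch only mirrors the CharRef grammar; it does not enforce the "Legal Character" constraint.

```python
import re

# Mirrors the CharRef production quoted above:
#   '&#' [0-9]+ ';'  |  '&#x' [0-9a-fA-F]+ ';'
CHAR_REF = re.compile(r"&#(?:x([0-9a-fA-F]+)|([0-9]+));")


def char_ref_code_point(ref):
    """Return the code point named by a numeric character reference.

    Hypothetical helper: it checks the syntax only, not whether the
    code point is a legal character.
    """
    match = CHAR_REF.fullmatch(ref)
    if match is None:
        raise ValueError(f"not a numeric character reference: {ref!r}")
    hex_digits, dec_digits = match.groups()
    if hex_digits is not None:
        return int(hex_digits, 16)  # '&#x...;' form: hexadecimal
    return int(dec_digits, 10)      # '&#...;' form: decimal


# The decimal and hexadecimal forms name the same code point.
print(char_ref_code_point("&#8249;") == char_ref_code_point("&#x2039;"))
# The reference from the question names code point 139, i.e. U+008B.
print(hex(char_ref_code_point("&#139;")))
```

In other words, `&#139;` asks for the character at Unicode code point 139, not for byte 139 of whatever legacy encoding the author had in mind.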

Importantly, character codes … are synchronized between Unicode and ISO/IEC 10646
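
The quoted text does not spell out the consequence, so the following is offered as a likely reading rather than the answerer's own wording. In Unicode, code points 139 and 155 are C1 control characters, while the angle quotation marks sit at 8249 and 8250. The numbers 139 and 155 are the byte values of ‹ and › in the windows-1252 encoding, which is probably where they came from. A small Python sketch (the helper name `describe` is made up for illustration) shows this:

```python
import unicodedata


def describe(code_point):
    """Return (general category, name) for a Unicode code point.

    Hypothetical helper. Control characters have no name in
    unicodedata, so a placeholder is returned for them.
    """
    ch = chr(code_point)
    return unicodedata.category(ch), unicodedata.name(ch, "<no name>")


for code_point in (139, 155, 8249, 8250):
    category, name = describe(code_point)
    print(f"U+{code_point:04X} {category} {name}")

# Bytes 139 and 155 decoded as windows-1252 give the intended characters.
legacy = bytes([139, 155]).decode("cp1252")
print([f"U+{ord(ch):04X}" for ch in legacy])
```

Category `Cc` means "control", which has no visible glyph. That would explain the `?` placeholders, while `&#8249;` and `&#8250;` (or `&lsaquo;` and `&rsaquo;`) name the real punctuation characters.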