Unicode and HTML entities

Question

I have a site served with charset=utf-8, running Bootstrap 3.7 with the HTML5 doctype.

Someone seems to have inserted this:

<div>&#139;text&#155;</div>

It renders incorrectly, like this:

?tekst?

It works if we use this instead:

<div>&#8249;text&#8250;</div>

or

<div>&lsaquo;text&rsaquo;</div>

This gives us ‹text›, but I would like to know why the first set of Unicode characters does not work.

Answer 1

As defined in this W3C Recommendation (which HTML also supports):

Character and Entity References

[Definition: A character reference refers to a specific character in the ISO/IEC 10646 character set, for example one not directly accessible from available input devices.]

Character Reference
CharRef ::= '&#' [0-9]+ ';' | '&#x' [0-9a-fA-F]+ ';'
Well-formedness constraint: Legal Character. Characters referred to using character references must match the production for Char (any Unicode character, excluding the surrogate blocks, FFFE, and FFFF.).

If the character reference begins with "&#x", the digits and letters up to the terminating ; provide a hexadecimal representation of the character's code point in ISO/IEC 10646.
If it begins just with "&#", the digits up to the terminating ; provide a decimal representation of the character's code point.
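To make the quoted CharRef production concrete, here is a minimal Python sketch that decodes numeric character references exactly as the two rules above describe. The function name decode_char_refs is hypothetical, and the sketch deliberately ignores the "Legal Character" constraint and named entities such as &lsaquo;; it is an illustration, not how any particular browser implements parsing.

```python
import re

# Mirrors the quoted production:
#   CharRef ::= '&#' [0-9]+ ';' | '&#x' [0-9a-fA-F]+ ';'
CHAR_REF = re.compile(r"&#(?:x([0-9a-fA-F]+)|([0-9]+));")


def decode_char_refs(text):
    """Replace each numeric character reference with the code point it names."""

    def replace(match):
        hex_digits, dec_digits = match.groups()
        # '&#x...;' is hexadecimal, plain '&#...;' is decimal.
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        return chr(code_point)

    return CHAR_REF.sub(replace, text)


print(decode_char_refs("&#8249;text&#8250;"))      # ‹text›
print(decode_char_refs("&#x2039;text&#x203A;"))    # ‹text›
print(repr(decode_char_refs("&#139;text&#155;")))  # '\x8btext\x9b'
```

Read literally, &#139; and &#155; name code points U+008B and U+009B, not the angle quotation marks.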

Important: character codes … are synchronized between Unicode and ISO/IEC 10646
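Because a character reference is a Unicode code point, the numbers in the question can be checked directly. The sketch below uses Python's standard unicodedata module to show what each number refers to. The Windows-1252 comparison at the end is an assumption about where 139 and 155 came from, which the quoted text does not state: those are the byte values of ‹ and › in that legacy encoding, not their Unicode code points.

```python
import unicodedata

# The four numbers used in the question's character references.
for number in (139, 155, 8249, 8250):
    ch = chr(number)
    # Control characters have no Unicode name, so supply a fallback.
    name = unicodedata.name(ch, "<unnamed control character>")
    category = unicodedata.category(ch)
    print(f"&#{number}; -> U+{number:04X} category={category} {name}")

# Assumption: 139 and 155 were taken from the Windows-1252 code page,
# where those byte values do encode the single angle quotation marks.
legacy = bytes([139, 155]).decode("cp1252")
print(legacy == "\u2039\u203a")  # True
```

Under that assumption, &#139; and &#155; point at C1 control characters (category Cc), which have no visible glyph, while &#8249; and &#8250; point at the intended quotation marks.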
