Why doesn't the normalize() method work in some cases?

Question

By default, String.prototype.normalize() uses NFC as its argument. NFC replaces multiple characters with a single one.

You can specify "NFC" to get the composed canonical form, in which multiple code points are replaced with single code points where possible.

Here is the example from MDN. It works:

let str = '\u006E\u0303';
str = str.normalize();
console.log(`${str}: ${str.length}`); // ñ: 1

But then I decided to try this method with other characters. For example:

let str = '\u0057\u0303';
str = str.normalize();
console.log(`${str}: ${str.length}`); // W̃: 2

What is wrong with the second example? Why doesn't it work?

Answer 1

It doesn't replace multiple characters; it replaces multiple code points, and only where possible.

ñ, a character used in Spanish, has its own code point in Unicode, U+00F1, so you can just say "ñ" instead of "take an n and then put a ~ over it".

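W̃, a representation of a phonic sound, does not have its own code point. It is a relatively rarely used character, so it hasn't been given precious space in the more efficient parts of Unicode. The only way to write it is to say "take a W and then put a ~ over it", so NFC has nothing to compose it into and leaves both code points in place.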
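
To see the difference directly, it may help to print the code points that remain after normalization. This is only a small sketch: codePoints is a hypothetical helper written for this answer, not part of any library, and the variable names are made up for illustration.

```javascript
// Hypothetical helper (not a built-in): list the code points of a string as U+XXXX.
function codePoints(s) {
  return Array.from(s, (ch) =>
    'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0')
  );
}

// n + combining tilde: a precomposed code point exists, so NFC composes it.
const nTilde = '\u006E\u0303'.normalize('NFC');
// W + combining tilde: no precomposed code point exists, so NFC leaves it alone.
const wTilde = '\u0057\u0303'.normalize('NFC');

console.log(codePoints(nTilde)); // [ 'U+00F1' ]
console.log(codePoints(wTilde)); // [ 'U+0057', 'U+0303' ]

// NFD goes the other way and splits the precomposed ñ back into two code points.
console.log('\u00F1'.normalize('NFD').length); // 2
```

So normalize() did work in the second example; the string was already in its shortest canonical form.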
