Does NFKD decompose characters by compatibility and also canonical equivalence?

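After several attempts to understand it, I have to say I don't understand how String.prototype.normalize() works. The method accepts one of the following values as its argument: NFC, NFD, NFKC, NFKD.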

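First, I don't understand the difference between NFD and NFKD. The specification is quite vague on this point, so... In some resources, I read that NFD decomposes characters by canonical equivalence. For example: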

"â" (U+00E2) -> "a" (U+0061) + " ̂" (U+0302)

And NFKD decomposes characters by compatibility. For example:

"ﬁ" (U+FB01) -> "f" (U+0066) + "i" (U+0069)

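But that is not the whole story. NFKD does not decompose characters by compatibility only. It also handles the first example perfectly well: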

let s = `\u00E2`; //"â" 
console.log(s.normalize('NFD').length); //2
console.log(s.normalize('NFKD').length); //2

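Does this mean that NFKD decomposes characters by both compatibility and canonical equivalence, while NFD decomposes characters by canonical equivalence only?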

let s = `\uFB01`; //"ﬁ"
console.log(s.normalize('NFD').length); //1

Unicode

The type of full decomposition chosen depends on which Unicode Normalization Form is involved. For NFC or NFD, one does a full canonical decomposition, which makes use of only canonical Decomposition_Mapping values. For NFKC or NFKD, one does a full compatibility decomposition, which makes use of canonical and compatibility Decomposition_Mapping values.

This is why NFC/NFD and NFKC/NFKD behave the way they do:

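let s1 = '\uFB00'; // "ﬀ" (a single ligature code point)
let s2 = '\u0066\u0066'; // "ff" (two separate letters)
console.log(s1.normalize('NFD').length); // 1: NFD ignores compatibility mappings and uses canonical equivalence only

let t1 = `\u00F4`; // "ô" (precomposed)
let t2 = `\u006F\u0302`; // "ô" ("o" followed by a combining circumflex)
console.log(t1.normalize('NFKD').length); // 2: NFKD also applies canonical decomposition
console.log(t2.normalize('NFKC').length); // 1: NFKC also applies canonical composition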
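
To see more than the lengths, it may help to print the actual code points each form produces. The following is only a minimal sketch: `codePoints` and `decompositionTable` are hypothetical helper names introduced here for illustration, not part of any API. Only `String.prototype.normalize()`, `codePointAt()` and `Array.from()` are built in.

```javascript
// Hypothetical helper: list the code points of a string as "U+XXXX" labels.
function codePoints(str) {
  return Array.from(str, (ch) =>
    'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0')
  );
}

const FORMS = ['NFC', 'NFD', 'NFKC', 'NFKD'];

// Hypothetical helper: normalize `str` under every form and collect the code points.
function decompositionTable(str) {
  const table = {};
  for (const form of FORMS) {
    table[form] = codePoints(str.normalize(form));
  }
  return table;
}

// "ô" has a canonical mapping, so NFD and NFKD both split it.
console.log(decompositionTable('\u00F4'));

// The "ﬀ" ligature has only a compatibility mapping, so only NFKC and NFKD split it.
console.log(decompositionTable('\uFB00'));
```

For U+00F4, NFD and NFKD both yield U+006F U+0302, while NFC and NFKC keep the single code point. For U+FB00, NFC and NFD leave it untouched, and NFKC and NFKD both yield U+0066 U+0066.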

This makes perfect sense, because...

MDN

All canonically equivalent sequences are also compatible, but not vice versa.
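
The MDN statement can be checked directly. This is a rough sketch, assuming that "equivalent" is tested by comparing normalized forms; `canonicallyEquivalent` and `compatibilityEquivalent` are hypothetical helper names, not built-in functions.

```javascript
// Hypothetical helpers: two strings are treated as equivalent when their
// fully decomposed forms are identical.
const canonicallyEquivalent = (a, b) =>
  a.normalize('NFD') === b.normalize('NFD');
const compatibilityEquivalent = (a, b) =>
  a.normalize('NFKD') === b.normalize('NFKD');

// Precomposed "ô" vs. "o" + combining circumflex: equivalent in both senses.
console.log(canonicallyEquivalent('\u00F4', '\u006F\u0302'));   // true
console.log(compatibilityEquivalent('\u00F4', '\u006F\u0302')); // true

// "ﬀ" ligature vs. plain "ff": compatible, but not canonically equivalent.
console.log(canonicallyEquivalent('\uFB00', 'ff'));   // false
console.log(compatibilityEquivalent('\uFB00', 'ff')); // true
```

The second pair is the "not vice versa" case: the ligature and the two plain letters match only under the K forms.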