What is the difference between UTF-32 and UCS-4?

What is the difference between UTF-32 and UCS-4? Isn't UTF-32 supposed to be a fixed-width encoding?

UTF-32 started out as a subset of UCS-4. Today the two are identical, except that the UTF-32 standard carries additional Unicode semantics. See Wikipedia for the details:

The original ISO 10646 standard defines a 31-bit encoding form called UCS-4, in which each encoded character in the Universal Character Set (UCS) is represented by a 32-bit friendly code value in the code space of integers between 0 and hexadecimal 7FFFFFFF.

Because only 17 planes are actually in use, all current code points are between 0 and 0x10FFFF. UTF-32 is a subset of UCS-4 that uses only this range. Since the Principles and Procedures document of JTC1/SC2/WG2 states that all future assignments of characters will be constrained to the BMP or the first 14 supplementary planes, UTF-32 will be able to represent all Unicode characters. Accordingly, UCS-4 and UTF-32 are now identical except that the UTF-32 standard has additional Unicode semantics.
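To make the two ranges in the quote concrete, here is a minimal Python sketch. The function names `in_original_ucs4` and `is_valid_utf32` are hypothetical and only illustrate the quoted limits. The surrogate check is one Unicode rule that UTF-32 enforces (code units must be Unicode scalar values); it is shown as an example, not as a full account of the "additional Unicode semantics".

```python
UCS4_MAX = 0x7FFFFFFF   # upper bound of the original 31-bit UCS-4 code space
UTF32_MAX = 0x10FFFF    # last code point of plane 16 (17 planes in total)


def in_original_ucs4(value: int) -> bool:
    """True if value lies in the original UCS-4 range quoted above."""
    return 0 <= value <= UCS4_MAX


def is_valid_utf32(value: int) -> bool:
    """True if value is a Unicode scalar value, i.e. a legal UTF-32 code unit."""
    if not 0 <= value <= UTF32_MAX:
        return False
    # Surrogate code points (U+D800..U+DFFF) are not scalar values.
    return not 0xD800 <= value <= 0xDFFF


for value in (0x41, 0x1F600, 0xD800, 0x110000, 0x7FFFFFFF):
    print(f"{value:#x}: UCS-4={in_original_ucs4(value)} UTF-32={is_valid_utf32(value)}")
```

A value such as 0x110000 fits the original UCS-4 code space but falls outside what UTF-32 can represent.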

That said, I'm not sure what "additional Unicode semantics" means. Perhaps someone else can provide a better answer.

The Unicode Standard, Version 8.0, Appendix C states:

UCS-4 stands for “Universal Character Set coded in 4 octets.” It is now treated simply as a synonym for UTF-32, and is considered the canonical form for representation of characters in ISO 10646 (Universal Coded Character Set).
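On the fixed-width part of the question, a quick check with Python's built-in `utf-32-be` codec (which writes no byte order mark) suggests that every code point takes exactly four bytes. The helper `utf32_units` is a hypothetical name used only for this sketch.

```python
from typing import List


def utf32_units(text: str) -> List[int]:
    """Split the big-endian UTF-32 encoding of text into 32-bit code units."""
    data = text.encode("utf-32-be")  # no BOM is written for the -be variant
    return [int.from_bytes(data[i:i + 4], "big") for i in range(0, len(data), 4)]


sample = "A€😀"  # one ASCII, one BMP, and one supplementary-plane character
encoded = sample.encode("utf-32-be")

print(len(sample), len(encoded))             # 3 code points, 12 bytes
print([hex(u) for u in utf32_units(sample)])  # each unit equals the code point
```

Each 32-bit unit equals the code point value itself, which is what makes the encoding fixed width.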

Tags: string, unicode, encoding, char, utf