Given the number of a Unicode code point, how can I obtain a String or CharSequence object for that character?

I have seen Questions and Answers for Java about getting the code point number of a Unicode character. For example, the Question How can I get a Unicode character's code?

But I want the opposite: given an integer, how do I get the text of the character assigned to that code point number?

The char primitive data type is of no use, being limited to the Basic Multilingual Plane of the Unicode character set. That plane represents approximately the first 64,000 characters defined in Unicode. But Unicode has grown to nearly double that, with over 113,000 characters defined now. The numbers assigned to characters range to over a million. Being based on 16 bits, char is limited to a range of 64K, which is not enough.

The Character and String classes offer the method codePointAt to examine a character, returning an int that represents the code point assigned in Unicode. I am looking for the opposite.

➥ Given an int, how can I obtain an object of Character, String, or some implementation of CharSequence, which I can then join with other text?

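When writing a string literal, we can use Unicode escape sequences with a backslash-with-u. But I am interested in using an integer variable, soft-coding rather than hard-coding the Unicode character.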

tl;dr

String s = Character.toString( 128_567 ) ;

Details

You asked for an object of Character, String, or some implementation of CharSequence.

Character

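The Character class is effectively legacy, being just an object wrapper around the primitive char type. The char type is also legacy, defined internally as a 16-bit number and so limited to the first 64K of Unicode code points. Unicode now assigns more than twice that many code points to characters, so char cannot represent most characters.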

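So we cannot instantiate a Character object for characters outside the Basic Multilingual Plane. As a workaround, Character.toString( int ) produces a String containing a single character. String can handle any and all Unicode characters, while Character cannot.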
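To make the limitation concrete, here is a minimal sketch, assuming Java 11 or later. The class and method names (BmpLimitDemo, fitsInChar, charsNeeded) are made up for illustration; only the Character and String calls come from the JDK.

```java
class BmpLimitDemo {

    // True if the code point lies in the Basic Multilingual Plane,
    // meaning a single 16-bit char (and so a Character) can hold it.
    static boolean fitsInChar(int codePoint) {
        return Character.isBmpCodePoint(codePoint);
    }

    // Number of 16-bit char values needed to store the code point: 1 or 2.
    static int charsNeeded(int codePoint) {
        return Character.charCount(codePoint);
    }

    public static void main(String[] args) {
        int latinA = 0x41;       // LATIN CAPITAL LETTER A, inside the BMP
        int mask = 0x1F637;      // FACE WITH MEDICAL MASK, outside the BMP

        System.out.println(fitsInChar(latinA) + " " + charsNeeded(latinA)); // true 1
        System.out.println(fitsInChar(mask) + " " + charsNeeded(mask));     // false 2

        // The String holds one character, stored as two char values.
        String s = Character.toString(mask);                  // Requires Java 11+.
        System.out.println(s.length());                       // 2
        System.out.println(s.codePointCount(0, s.length()));  // 1
    }
}
```

The last two lines show why counting char values is misleading: the String contains one code point but reports a length of two.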

String Character.toString( int )

To get a String object containing a single character determined by an int, pass the int to Character.toString().

For example, we use FACE WITH MEDICAL MASK, an emoji character at U+1F637 (decimal: 128,567).

// -----|  input  |----------------
String input = "😷" ;                               // FACE WITH MEDICAL MASK at code point U+1F637 (decimal: 128,567).
int codePoint = input.codePointAt( 0 ) ;              // Returns 128,567. 
System.out.println( "codePoint : " + codePoint ) ;   

codePoint : 128567

Convert that int primitive variable to a String.

// -----|  String  |----------------
String output = Character.toString( codePoint ) ;     // Pass an `int` primitive integer number.
System.out.println( "output : " + output ) ; 

output : 😷

Or use an integer literal.

String output2 = Character.toString( 128_567 ) ;      // Pass an integer literal.
System.out.println( "output2 : " + output2 ) ;

output2 : 😷

See this code run live at IdeOne.com.
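Since the Question asks about joining the result with other text, here is a small sketch of doing that with a soft-coded int. It assumes Java 11 or later for Character.toString( int ). The fallback shown for older Java versions, built on Character.toChars, is my own addition rather than something discussed above. The class and method names are hypothetical.

```java
class CodePointToText {

    // Java 11+: throws IllegalArgumentException if the int is not a valid code point.
    static String fromCodePoint(int codePoint) {
        return Character.toString(codePoint);
    }

    // Assumed alternative for Java versions before 11: Character.toChars
    // returns one or two char values, which the String constructor accepts.
    static String fromCodePointLegacy(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        int codePoint = 128_567;  // Soft-coded; could come from user input or a file.

        // The resulting String concatenates with other text like any other String.
        String message = "Stay safe " + fromCodePoint(codePoint) + "!";
        System.out.println(message);

        // Both approaches yield equal String objects.
        System.out.println(fromCodePoint(codePoint).equals(fromCodePointLegacy(codePoint))); // true

        // Round trip back to the original number.
        System.out.println(fromCodePoint(codePoint).codePointAt(0)); // 128567
    }
}
```

Both methods reject values outside the Unicode range, such as a negative number, by throwing IllegalArgumentException, so input from an untrusted source may need a check with Character.isValidCodePoint first.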

CharSequence

The code above also works for CharSequence, as String is an implementation of CharSequence.

CharSequence cs = Character.toString( 128_567 ) ;     // Returns a `String` which is a `CharSequence`. 

appendCodePoint

The StringBuilder class offers a method appendCodePoint to add a character via its assigned Unicode code point number. Ditto for the thread-safe StringBuffer.
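A minimal sketch of appendCodePoint in use follows. The helper name build and the class name AppendCodePointDemo are hypothetical, chosen only for this example.

```java
class AppendCodePointDemo {

    // Builds text from a series of code point numbers.
    static String build(int... codePoints) {
        StringBuilder sb = new StringBuilder();
        for (int codePoint : codePoints) {
            sb.appendCodePoint(codePoint);  // Handles characters inside and outside the BMP.
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 'H', 'i', SPACE, FACE WITH MEDICAL MASK
        String s = build(0x48, 0x69, 0x20, 0x1F637);
        System.out.println(s);                                // Hi 😷
        System.out.println(s.codePointCount(0, s.length()));  // 4 characters
        System.out.println(s.length());                       // 5 char values
    }
}
```

This approach suits cases where several characters are assembled in a loop, since it avoids creating an intermediate String per character.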