Go rune literal for high positioned emojis

Question

How can we write a rune literal for an emoji whose code point lies beyond (I think) U+265F?

a1 := '\u2665'

This works.

a2 := '\u1F3A8'

This gives the error: invalid character literal (more than one character).

Is there a way to represent higher-positioned emojis as rune literals?

https://unicode.org/emoji/charts/full-emoji-list.html

Answer 1

You can use the \U escape followed by exactly 8 hexadecimal digits, which are the hexadecimal representation of the Unicode code point. This is detailed in Spec: Rune literals:

There are four ways to represent the integer value as a numeric constant: \x followed by exactly two hexadecimal digits; \u followed by exactly four hexadecimal digits; \U followed by exactly eight hexadecimal digits, and a plain backslash \ followed by exactly three octal digits. In each case the value of the literal is the value represented by the digits in the corresponding base.
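To make the four forms concrete, here is a minimal sketch that writes the same value, the letter 'A' (U+0041), with each escape, and then prints the palette emoji from the question using the eight-digit form. The helper name sameValueEscapes is made up for this illustration and is not part of any library.

```go
package main

import "fmt"

// sameValueEscapes (hypothetical helper) returns the letter 'A' (U+0041)
// written once with each of the four numeric escape forms from the spec.
func sameValueEscapes() [4]rune {
	return [4]rune{
		'\x41',       // \x followed by exactly two hex digits
		'\u0041',     // \u followed by exactly four hex digits
		'\U00000041', // \U followed by exactly eight hex digits
		'\101',       // \ followed by exactly three octal digits
	}
}

func main() {
	for _, r := range sameValueEscapes() {
		fmt.Printf("%c %U\n", r, r) // each line prints: A U+0041
	}

	// Code points above U+FFFF do not fit in four hex digits,
	// so they need the eight-digit \U form.
	palette := '\U0001F3A8'
	fmt.Printf("%c %U\n", palette, palette)
}
```

All four literals compare equal to 'A'; only the \U form has enough digits for U+1F3A8.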

For example:

a1 := '\u2665'
fmt.Printf("%c\n", a1)

a2 := '\U0001F3A8'
fmt.Printf("%c\n", a2)

Which outputs (try it on the Go Playground):

♥
🎨

Note (in response to @torek):

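I believe the Go authors chose to require exactly 4 and 8 hex digits because this allows the exact same form, the exact same rune literals, to be used inside interpreted string literals. For example, if you want a string that contains 2 runes, one with code point 0x0001F3A8 and the other being the character 4, it could look like this: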

s := "\U0001F3A84"

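If the spec did not require exactly 8 hex digits, it would be ambiguous whether the last '4' is part of the code point or a separate rune of the string, so you would have to break the string into a concatenation like "\U1F3A8" + "4".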
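A small sketch that checks this reading of the string: because \U consumes exactly eight hex digits, the trailing 4 ends up as its own rune. The names paletteThenFour and runesOf are chosen here for illustration only.

```go
package main

import "fmt"

// paletteThenFour is the string from the answer: \U takes exactly eight
// hex digits (0001F3A8), so the trailing '4' is a separate rune.
const paletteThenFour = "\U0001F3A84"

// runesOf (hypothetical helper) decodes a string into its code points.
func runesOf(s string) []rune {
	return []rune(s)
}

func main() {
	rs := runesOf(paletteThenFour)
	fmt.Println(len(rs)) // 2
	for _, r := range rs {
		fmt.Printf("%U\n", r) // U+1F3A8, then U+0034
	}
}
```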

Spec: String literals:

Interpreted string literals are character sequences between double quotes, as in "bar". Within the quotes, any character may appear except newline and unescaped double quote. The text between the quotes forms the value of the literal, with backslash escapes interpreted as they are in rune literals (except that \' is illegal and \" is legal), with the same restrictions. The three-digit octal (\nnn) and two-digit hexadecimal (\xnn) escapes represent individual bytes of the resulting string; all other escapes represent the (possibly multi-byte) UTF-8 encoding of individual characters. Thus inside a string literal \377 and \xFF represent a single byte of value 0xFF=255, while ÿ, \u00FF, \U000000FF and \xc3\xbf represent the two bytes 0xc3 0xbf of the UTF-8 encoding of character U+00FF.
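The byte-versus-character distinction in that passage can be checked directly. The sketch below prints the raw bytes of each literal; hexBytes is a helper name invented for this example.

```go
package main

import "fmt"

// hexBytes (hypothetical helper) renders the raw bytes of s as
// space-separated lowercase hex, for example "c3 bf".
func hexBytes(s string) string {
	return fmt.Sprintf("% x", s)
}

func main() {
	// Octal and \x escapes each produce exactly one byte.
	fmt.Println(hexBytes("\377")) // ff
	fmt.Println(hexBytes("\xFF")) // ff

	// \u and \U escapes produce the UTF-8 encoding of the code point.
	fmt.Println(hexBytes("\u00FF"))     // c3 bf
	fmt.Println(hexBytes("\U000000FF")) // c3 bf

	// Spelling out the two UTF-8 bytes by hand yields the same string.
	fmt.Println("\xc3\xbf" == "\u00FF") // true
}
```

Note that "\xFF" on its own is not valid UTF-8, so a string built from it holds a single raw byte rather than the character ÿ.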

unicode

go

emoji

rune