Go rune literal for high positioned emojis
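How can we use a rune literal for emoji that lie beyond, I think, code point U+265F?
a1 := '\u2665'
- this works
a2 := '\u1F3A8'
- this fails with the error: invalid character literal, more than one character.
Is there a way to represent emoji at higher code points as rune literals?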
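You can use the \U sequence followed by exactly 8 hexadecimal digits, which are the hexadecimal representation of the Unicode code point. This is detailed in Spec: Rune literals: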
There are four ways to represent the integer value as a numeric constant: \x followed by exactly two hexadecimal digits; \u followed by exactly four hexadecimal digits; \U followed by exactly eight hexadecimal digits, and a plain backslash \ followed by exactly three octal digits. In each case the value of the literal is the value represented by the digits in the corresponding base.
For example:
a1 := '\u2665'
fmt.Printf("%c\n", a1)
a2 := '\U0001F3A8'
fmt.Printf("%c\n", a2)
This outputs (try it on the Go Playground):
♥
🎨
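As a self-contained variant of the snippet above, the following sketch can be pasted into a file and run as is. The helper name describe is hypothetical (it is not part of the original answer); it only prints each rune together with its code point and UTF-8 length, and the first line checks that the four numeric escape forms from the spec can spell the same small code point.

```go
package main

import "fmt"

// describe is a hypothetical helper: it formats a rune as its character,
// its code point, and the number of bytes in its UTF-8 encoding.
func describe(r rune) string {
	return fmt.Sprintf("%c %U %d bytes", r, r, len(string(r)))
}

func main() {
	// \x (2 hex digits), \u (4 hex digits), \U (8 hex digits) and
	// \ooo (3 octal digits) all denote the code point of 'A' here.
	fmt.Println('\x41' == '\u0041', '\u0041' == '\U00000041', '\U00000041' == '\101')

	fmt.Println(describe('\u2665'))     // fits in four hex digits
	fmt.Println(describe('\U0001F3A8')) // above U+FFFF, needs the eight-digit form
}
```

Code points above U+FFFF do not fit in the four digits that \u accepts, which is why the eight-digit \U form with leading zeros is needed for them.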
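Note (in reply to @torek):
I believe the Go authors chose to require exactly 4 and 8 hex digits because this allows the exact same forms, the exact same rune literals, to be used inside interpreted string literals. For example, if you want a string that contains 2 runes, one with code point 0x0001F3A8 and the other being the character 4, it could look like this:
s := "\U0001F3A84"
If the spec did not require exactly 8 hex digits, it would be ambiguous whether the last '4' is part of the code point or a separate rune of the string, so you would have to break the string into a concatenation like "\U1F3A8" + "4".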
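To check how the fixed-width escape is parsed, a minimal sketch along these lines may help. The helper name runesOf is hypothetical and simply converts the string to a rune slice; the point is that the eight hex digits are consumed as one code point and the trailing 4 remains a separate rune.

```go
package main

import "fmt"

// runesOf is a hypothetical helper that returns the code points of s in order.
func runesOf(s string) []rune {
	return []rune(s)
}

func main() {
	// Exactly eight hex digits belong to the escape; the final '4' is its own rune.
	s := "\U0001F3A84"
	for _, r := range runesOf(s) {
		fmt.Printf("%U %c\n", r, r)
	}
	fmt.Println(len(runesOf(s))) // number of runes in s
}
```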
Spec: String literals:
Interpreted string literals are character sequences between double quotes, as in "bar". Within the quotes, any character may appear except newline and unescaped double quote. The text between the quotes forms the value of the literal, with backslash escapes interpreted as they are in rune literals (except that \' is illegal and \" is legal), with the same restrictions. The three-digit octal (\nnn) and two-digit hexadecimal (\xnn) escapes represent individual bytes of the resulting string; all other escapes represent the (possibly multi-byte) UTF-8 encoding of individual characters. Thus inside a string literal \377 and \xFF represent a single byte of value 0xFF=255, while ÿ, \u00FF, \U000000FF and \xc3\xbf represent the two bytes 0xc3 0xbf of the UTF-8 encoding of character U+00FF.
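The byte-versus-character distinction in the quoted paragraph can be observed directly. This is a small sketch, with hexBytes as a hypothetical helper that renders the raw bytes of a string in hex; it assumes nothing beyond the standard fmt package.

```go
package main

import "fmt"

// hexBytes is a hypothetical helper that renders the raw bytes of s as
// space-separated hex pairs.
func hexBytes(s string) string {
	return fmt.Sprintf("% x", s)
}

func main() {
	// Octal and \x escapes each insert one raw byte.
	fmt.Println(hexBytes("\377"))
	fmt.Println(hexBytes("\xFF"))

	// \u and \U escapes insert the UTF-8 encoding of the code point.
	fmt.Println(hexBytes("\u00FF"))
	fmt.Println(hexBytes("\U000000FF"))

	// Spelling the two UTF-8 bytes by hand yields the same string value.
	fmt.Println("\u00FF" == "\xc3\xbf")
}
```

Note that "\xFF" on its own is not valid UTF-8, so ranging over it yields the replacement character U+FFFD rather than U+00FF.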