Perl:将字节序列打包成字符串
Perl: Packing a sequence of bytes into a string
我正在尝试 运行 一个简单的测试,借此我想要不同格式的二进制字符串并将它们打印出来。事实上,我正在尝试调查 sprintf
无法处理为占位符 %s
.
传入的宽字符字符串的问题
在这种情况下,二进制字符串应只包含西里尔字母“д”(因为它高于 ISO-8859-1)
当我直接在源代码中使用字符时,下面的代码有效。
但是通过 pack
的任何东西都不起作用。
- 对于 UTF-8 的情况,我需要在字符串
$ch
上设置 UTF-8 标志,但是如何设置。
- UCS-2 案例失败了,我想这是因为 ISO-8859-1 中的 Perl UCS-2 没有办法,所以该测试可能是胡说八道,对吧?
代码:
#!/usr/bin/perl
use utf8; # Meaning "This lexical scope (i.e. file) contains utf8"
# https://perldoc.perl.org/open.html
use open qw(:std :encoding(UTF-8));
sub showme {
my ($name,$ch) = @_;
print "-------\n";
print "This is test: $name\n";
my $ord = ord($ch); # ordinal computed outside of "use bytes"; actually should yield the unicode codepoint
{
# https://perldoc.perl.org/bytes.html
use bytes;
my $mark = (utf8::is_utf8($ch) ? "yes" : "no");
my $txt = sprintf("Received string of length: %i byte, contents: %vd, ordinal x%04X, utf-8: %s\n", length($ch), $ch, $ord, $mark);
print $txt,"\n";
}
print $ch, "\n";
print "Combine: $ch\n";
print "Concat: " . $ch . "\n";
print "Sprintf: " . sprintf("%s",$ch) . "\n";
print "-------\n";
}
showme("Cryillic direct" , "д");
showme("Cyrillic UTF-8" , pack("HH","D0","B4")); # UTF-8 of д is D0B4
showme("Cyrillic UCS-2" , pack("HH","04","34")); # UCS-2 of д is 0434
当前输出:
看起来不错
-------
This is test: Cryillic direct
Received string of length: 2 byte, contents: 208.180, ordinal x0434, utf-8: yes
д
Combine: д
Concat: д
Sprintf: д
-------
不行。 176从哪里来??
-------
This is test: Cyrillic UTF-8
Received string of length: 2 byte, contents: 208.176, ordinal x00D0, utf-8: no
а
Combine: а
Concat: а
Sprintf: а
-------
这更糟。
-------
This is test: Cyrillic UCS-2
Received string of length: 2 byte, contents: 0.48, ordinal x0000, utf-8: no
0
Combine: 0
Concat: 0
Sprintf: 0
-------
你有两个问题。
您对 pack
的调用不正确
每个 H
代表一个十六进制数字。
$ perl -e'printf "%vX\n", pack("HH", "D0", "B4")' # XXX
D0.B0
$ perl -e'printf "%vX\n", pack("H2H2", "D0", "B4")' # Ok
D0.B4
$ perl -e'printf "%vX\n", pack("(H2)2", "D0", "B4")' # Ok
D0.B4
$ perl -e'printf "%vX\n", pack("(H2)*", "D0", "B4")' # Better
D0.B4
$ perl -e'printf "%vX\n", pack("H*", "D0B4")' # Alternative
D0.B4
STDOUT 需要解码文本,但您提供的是编码文本
首先,让我们看一下您正在生成的字符串(一旦解决了上述问题)。您所需要的只是 %vX
格式,它以十六进制形式提供每个字符的句点分隔值。
"д"
生成单字符字符串。此字符是 д
.
的 Unicode 代码点
$ perl -e'use utf8; printf("%vX\n", "д");'
434
pack("H*", "D0B4")
生成两个字符的字符串。这些字符是д
.
的UTF-8编码
$ perl -e'printf("%vX\n", pack("H*", "D0B4"));'
D0.B4
pack("H*", "0434")
生成两个字符的字符串。这些字符是 д
.
的 UCS-2be 和 UTF-16be 编码
$ perl -e'printf("%vX\n", pack("H*", "0434"));'
4.34
通常,文件句柄需要打印一串字节(值为 0..255 的字符)。这些字节是逐字输出的。[1][2]
当一个编码层(例如 :encoding(UTF-8)
)被添加到一个文件句柄时,它期望将一串 Unicode 代码点(又名解码文本)打印到它。
您的程序向 STDOUT
添加了一个编码层(通过使用 use open
pragma),因此您必须向 print
和 [=30 提供 UCP(解码文本) =].例如,您可以使用 Encode 的 decode
函数从编码文本中获取解码文本。
use utf8;
use open qw( :std :encoding(UTF-8) );
use feature qw( say );
use Encode qw( decode );
say "д"; # ok (UCP of "д")
say pack("H*", "D0B4"); # XXX (UTF-8 encoding of "д")
say pack("H*", "0434"); # XXX (UCS-2be and UTF-16be encoding of "д")
say decode("UTF-8", pack("H*", "D0B4")); # ok (UCP of "д")
say decode("UCS-2be", pack("H*", "0434")); # ok (UCP of "д")
say decode("UTF-16be", pack("H*", "0434")); # ok (UCP of "д")
For the UTF-8 case, I need to set the UTF-8 flag on
不,您需要对字符串进行解码。
UTF-8 标志无关紧要。最初是否设置标志是无关紧要的。字符串解码后是否设置标志是无关紧要的。该标志指示字符串在内部如何存储,这是您不应该关心的。
例如取
use strict;
use warnings;
use open qw( :std :encoding(UTF-8) );
use feature qw( say );
my $x = chr(0xE9);
utf8::downgrade($x); # Tell Perl to use the UTF8=0 storage format.
say sprintf "%s %vX %s", utf8::is_utf8($x) ? "UTF8=1" : "UTF8=0", $x, $x;
utf8::upgrade($x); # Tell Perl to use the UTF8=1 storage format.
say sprintf "%s %vX %s", utf8::is_utf8($x) ? "UTF8=1" : "UTF8=0", $x, $x;
输出
UTF8=0 E9 é
UTF8=1 E9 é
不考虑UTF8
标志,输出提供的UCP(U+00E9)的UTF-8编码(C3 A9
)。
I suppose it's because there is no way for Perl UCS-2 from ISO-8859-1, so that test is probably bollocks, right?
充其量,人们可以使用试探法来猜测字符串是使用 iso-latin-1 还是 UCS-2be 编码的。我怀疑可以得到相当准确的结果(比如 those 你会得到 iso-latin-1 和 UTF-8。)
我不确定你为什么要提到 iso-latin-1,因为你的问题中没有其他内容与 iso-latin-1 相关。
除了 Windows,其中默认添加了一个 :crlf
图层到句柄。
如果您提供的字符串包含非字节字符,并且输出字符串的 utf8
编码,则会收到 Wide character
警告。
请看下面的演示代码是否有帮助
use strict;
use warnings;
use feature 'say';
use utf8; # https://perldoc.perl.org/utf8.html
use Encode; # https://perldoc.perl.org/Encode.html
my $str;
my $utf8 = 'Привет Москва';
my $ucs2le = '1f044004380432043504420420001c043e0441043a0432043004'; # Little Endian
my $ucs2be = '041f044004380432043504420020041c043e0441043a04320430'; # Big Endian
my $utf16 = '041f044004380432043504420020041c043e0441043a04320430';
my $utf32 = '0000041f0000044000000438000004320000043500000442000000200000041c0000043e000004410000043a0000043200000430';
# https://perldoc.perl.org/functions/binmode.html
binmode STDOUT, ':utf8';
# https://perldoc.perl.org/feature.html#The-'say'-feature
say 'UTF-8: ' . $utf8;
# https://perldoc.perl.org/Encode.html#THE-PERL-ENCODING-API
$str = pack('H*',$ucs2be);
say 'UCS-2BE: ' . decode('UCS-2BE',$str);
$str = pack('H*',$ucs2le);
say 'UCS-2LE: ' . decode('UCS-2LE',$str);
$str = pack('H*',$utf16);
say 'UTF-16: '. decode('UTF16',$str);
$str = pack('H*',$utf32);
say 'UTF-32: ' . decode('UTF32',$str);
输出
UTF-8: Привет Москва
UCS-2BE: Привет Москва
UCS-2LE: Привет Москва
UTF-16: Привет Москва
UTF-32: Привет Москва
支持的西里尔编码
use strict;
use warnings;
use feature 'say';
use Encode;
use utf8;
binmode STDOUT, ':utf8';
my $utf8 = 'Привет Москва';
my @encodings = qw/UCS-2 UCS-2LE UCS-2BE UTF-16 UTF-32 ISO-8859-5 CP855 CP1251 KOI8-F KOI8-R KOI8-U/;
say '
:: Supported Cyrillic encoding
---------------------------------------------
UTF-8 ', $utf8;
for (@encodings) {
printf "%-11s %s\n", $_, unpack('H*', encode($_,$utf8));
}
输出
:: Supported Cyrillic encoding
---------------------------------------------
UTF-8 Привет Москва
UCS-2 041f044004380432043504420020041c043e0441043a04320430
UCS-2LE 1f044004380432043504420420001c043e0441043a0432043004
UCS-2BE 041f044004380432043504420020041c043e0441043a04320430
UTF-16 feff041f044004380432043504420020041c043e0441043a04320430
UTF-32 0000feff0000041f0000044000000438000004320000043500000442000000200000041c0000043e000004410000043a0000043200000430
ISO-8859-5 bfe0d8d2d5e220bcdee1dad2d0
CP855 dde1b7eba8e520d3d6e3c6eba0
CP1251 cff0e8e2e5f220cceef1eae2e0
KOI8-F f0d2c9d7c5d420edcfd3cbd7c1
KOI8-R f0d2c9d7c5d420edcfd3cbd7c1
KOI8-U f0d2c9d7c5d420edcfd3cbd7c1
两个都是很好的答案。下面是 Polar Bear 代码的轻微扩展,用于打印有关字符串的详细信息:
use strict;
use warnings;
use feature 'say';
use utf8;
use Encode;
sub about {
my($str) = @_;
# https://perldoc.perl.org/bytes.html
my $charlen = length($str);
my $txt;
{
use bytes;
my $mark = (utf8::is_utf8($str) ? "yes" : "no");
my $bytelen = length($str);
$txt = sprintf("Length: %d byte, %d chars, utf-8: %s, contents: %vd\n",
$bytelen,$charlen,$mark,$str);
}
return $txt;
}
my $str;
my $utf8 = 'Привет Москва';
my $ucs2le = '1f044004380432043504420420001c043e0441043a0432043004'; # Little Endian
my $ucs2be = '041f044004380432043504420020041c043e0441043a04320430'; # Big Endian
my $utf16 = '041f044004380432043504420020041c043e0441043a04320430';
my $utf32 = '0000041f0000044000000438000004320000043500000442000000200000041c0000043e000004410000043a0000043200000430';
binmode STDOUT, ':utf8';
say 'UTF-8: ' . $utf8;
say about($utf8);
{
my $str = pack('H*',$ucs2be);
say 'UCS-2BE: ' . decode('UCS-2BE',$str);
say about($str);
}
{
my $str = pack('H*',$ucs2le);
say 'UCS-2LE: ' . decode('UCS-2LE',$str);
say about($str);
}
{
my $str = pack('H*',$utf16);
say 'UTF-16: '. decode('UTF16',$str);
say about($str);
}
{
my $str = pack('H*',$utf32);
say 'UTF-32: ' . decode('UTF32',$str);
say about($str);
}
# Try identity transcoding
{
my $str_encoded_in_utf16 = encode('UTF16',$utf8);
my $str = decode('UTF16',$str_encoded_in_utf16);
say 'The same: ' . $str;
say about($str);
}
运行 这给出:
UTF-8: Привет Москва
Length: 25 byte, 13 chars, utf-8: yes, contents: 208.159.209.128.208.184.208.178.208.181.209.130.32.208.156.208.190.209.129.208.186.208.178.208.176
UCS-2BE: Привет Москва
Length: 26 byte, 26 chars, utf-8: no, contents: 4.31.4.64.4.56.4.50.4.53.4.66.0.32.4.28.4.62.4.65.4.58.4.50.4.48
UCS-2LE: Привет Москва
Length: 26 byte, 26 chars, utf-8: no, contents: 31.4.64.4.56.4.50.4.53.4.66.4.32.0.28.4.62.4.65.4.58.4.50.4.48.4
UTF-16: Привет Москва
Length: 26 byte, 26 chars, utf-8: no, contents: 4.31.4.64.4.56.4.50.4.53.4.66.0.32.4.28.4.62.4.65.4.58.4.50.4.48
UTF-32: Привет Москва
Length: 52 byte, 52 chars, utf-8: no, contents: 0.0.4.31.0.0.4.64.0.0.4.56.0.0.4.50.0.0.4.53.0.0.4.66.0.0.0.32.0.0.4.28.0.0.4.62.0.0.4.65.0.0.4.58.0.0.4.50.0.0.4.48
The same: Привет Москва
Length: 25 byte, 13 chars, utf-8: yes, contents: 208.159.209.128.208.184.208.178.208.181.209.130.32.208.156.208.190.209.129.208.186.208.178.208.176
我做了一个小图作为下次的概述,涵盖 encode
、decode
和 pack
。因为最好为下一次做好准备。
(上图及其 graphml
文件可用 here)
我正在尝试 运行 一个简单的测试,借此我想要不同格式的二进制字符串并将它们打印出来。事实上,我正在尝试调查 sprintf
无法处理为占位符 %s
.
在这种情况下,二进制字符串应只包含西里尔字母“д”(因为它高于 ISO-8859-1)
当我直接在源代码中使用字符时,下面的代码有效。
但是通过 pack
的任何东西都不起作用。
- 对于 UTF-8 的情况,我需要在字符串
$ch
上设置 UTF-8 标志,但是如何设置。 - UCS-2 案例失败了,我想这是因为 ISO-8859-1 中的 Perl UCS-2 没有办法,所以该测试可能是胡说八道,对吧?
代码:
#!/usr/bin/perl
use utf8; # Meaning "This lexical scope (i.e. file) contains utf8"
# https://perldoc.perl.org/open.html
use open qw(:std :encoding(UTF-8));
sub showme {
my ($name,$ch) = @_;
print "-------\n";
print "This is test: $name\n";
my $ord = ord($ch); # ordinal computed outside of "use bytes"; actually should yield the unicode codepoint
{
# https://perldoc.perl.org/bytes.html
use bytes;
my $mark = (utf8::is_utf8($ch) ? "yes" : "no");
my $txt = sprintf("Received string of length: %i byte, contents: %vd, ordinal x%04X, utf-8: %s\n", length($ch), $ch, $ord, $mark);
print $txt,"\n";
}
print $ch, "\n";
print "Combine: $ch\n";
print "Concat: " . $ch . "\n";
print "Sprintf: " . sprintf("%s",$ch) . "\n";
print "-------\n";
}
showme("Cryillic direct" , "д");
showme("Cyrillic UTF-8" , pack("HH","D0","B4")); # UTF-8 of д is D0B4
showme("Cyrillic UCS-2" , pack("HH","04","34")); # UCS-2 of д is 0434
当前输出:
看起来不错
-------
This is test: Cryillic direct
Received string of length: 2 byte, contents: 208.180, ordinal x0434, utf-8: yes
д
Combine: д
Concat: д
Sprintf: д
-------
不行。 176从哪里来??
-------
This is test: Cyrillic UTF-8
Received string of length: 2 byte, contents: 208.176, ordinal x00D0, utf-8: no
а
Combine: а
Concat: а
Sprintf: а
-------
这更糟。
-------
This is test: Cyrillic UCS-2
Received string of length: 2 byte, contents: 0.48, ordinal x0000, utf-8: no
0
Combine: 0
Concat: 0
Sprintf: 0
-------
你有两个问题。
您对 pack
的调用不正确
每个 H
代表一个十六进制数字。
$ perl -e'printf "%vX\n", pack("HH", "D0", "B4")' # XXX
D0.B0
$ perl -e'printf "%vX\n", pack("H2H2", "D0", "B4")' # Ok
D0.B4
$ perl -e'printf "%vX\n", pack("(H2)2", "D0", "B4")' # Ok
D0.B4
$ perl -e'printf "%vX\n", pack("(H2)*", "D0", "B4")' # Better
D0.B4
$ perl -e'printf "%vX\n", pack("H*", "D0B4")' # Alternative
D0.B4
STDOUT 需要解码文本,但您提供的是编码文本
首先,让我们看一下您正在生成的字符串(一旦解决了上述问题)。您所需要的只是 %vX
格式,它以十六进制形式提供每个字符的句点分隔值。
的 Unicode 代码点"д"
生成单字符字符串。此字符是д
.$ perl -e'use utf8; printf("%vX\n", "д");' 434
的UTF-8编码pack("H*", "D0B4")
生成两个字符的字符串。这些字符是д
.$ perl -e'printf("%vX\n", pack("H*", "D0B4"));' D0.B4
的 UCS-2be 和 UTF-16be 编码pack("H*", "0434")
生成两个字符的字符串。这些字符是д
.$ perl -e'printf("%vX\n", pack("H*", "0434"));' 4.34
通常,文件句柄需要打印一串字节(值为 0..255 的字符)。这些字节是逐字输出的。[1][2]
当一个编码层(例如 :encoding(UTF-8)
)被添加到一个文件句柄时,它期望将一串 Unicode 代码点(又名解码文本)打印到它。
您的程序向 STDOUT
添加了一个编码层(通过使用 use open
pragma),因此您必须向 print
和 [=30 提供 UCP(解码文本) =].例如,您可以使用 Encode 的 decode
函数从编码文本中获取解码文本。
use utf8;
use open qw( :std :encoding(UTF-8) );
use feature qw( say );
use Encode qw( decode );
say "д"; # ok (UCP of "д")
say pack("H*", "D0B4"); # XXX (UTF-8 encoding of "д")
say pack("H*", "0434"); # XXX (UCS-2be and UTF-16be encoding of "д")
say decode("UTF-8", pack("H*", "D0B4")); # ok (UCP of "д")
say decode("UCS-2be", pack("H*", "0434")); # ok (UCP of "д")
say decode("UTF-16be", pack("H*", "0434")); # ok (UCP of "д")
For the UTF-8 case, I need to set the UTF-8 flag on
不,您需要对字符串进行解码。
UTF-8 标志无关紧要。最初是否设置标志是无关紧要的。字符串解码后是否设置标志是无关紧要的。该标志指示字符串在内部如何存储,这是您不应该关心的。
例如取
use strict;
use warnings;
use open qw( :std :encoding(UTF-8) );
use feature qw( say );
my $x = chr(0xE9);
utf8::downgrade($x); # Tell Perl to use the UTF8=0 storage format.
say sprintf "%s %vX %s", utf8::is_utf8($x) ? "UTF8=1" : "UTF8=0", $x, $x;
utf8::upgrade($x); # Tell Perl to use the UTF8=1 storage format.
say sprintf "%s %vX %s", utf8::is_utf8($x) ? "UTF8=1" : "UTF8=0", $x, $x;
输出
UTF8=0 E9 é
UTF8=1 E9 é
不考虑UTF8
标志,输出提供的UCP(U+00E9)的UTF-8编码(C3 A9
)。
I suppose it's because there is no way for Perl UCS-2 from ISO-8859-1, so that test is probably bollocks, right?
充其量,人们可以使用试探法来猜测字符串是使用 iso-latin-1 还是 UCS-2be 编码的。我怀疑可以得到相当准确的结果(比如 those 你会得到 iso-latin-1 和 UTF-8。)
我不确定你为什么要提到 iso-latin-1,因为你的问题中没有其他内容与 iso-latin-1 相关。
除了 Windows,其中默认添加了一个
:crlf
图层到句柄。如果您提供的字符串包含非字节字符,并且输出字符串的
utf8
编码,则会收到Wide character
警告。
请看下面的演示代码是否有帮助
use strict;
use warnings;
use feature 'say';
use utf8; # https://perldoc.perl.org/utf8.html
use Encode; # https://perldoc.perl.org/Encode.html
my $str;
my $utf8 = 'Привет Москва';
my $ucs2le = '1f044004380432043504420420001c043e0441043a0432043004'; # Little Endian
my $ucs2be = '041f044004380432043504420020041c043e0441043a04320430'; # Big Endian
my $utf16 = '041f044004380432043504420020041c043e0441043a04320430';
my $utf32 = '0000041f0000044000000438000004320000043500000442000000200000041c0000043e000004410000043a0000043200000430';
# https://perldoc.perl.org/functions/binmode.html
binmode STDOUT, ':utf8';
# https://perldoc.perl.org/feature.html#The-'say'-feature
say 'UTF-8: ' . $utf8;
# https://perldoc.perl.org/Encode.html#THE-PERL-ENCODING-API
$str = pack('H*',$ucs2be);
say 'UCS-2BE: ' . decode('UCS-2BE',$str);
$str = pack('H*',$ucs2le);
say 'UCS-2LE: ' . decode('UCS-2LE',$str);
$str = pack('H*',$utf16);
say 'UTF-16: '. decode('UTF16',$str);
$str = pack('H*',$utf32);
say 'UTF-32: ' . decode('UTF32',$str);
输出
UTF-8: Привет Москва
UCS-2BE: Привет Москва
UCS-2LE: Привет Москва
UTF-16: Привет Москва
UTF-32: Привет Москва
支持的西里尔编码
use strict;
use warnings;
use feature 'say';
use Encode;
use utf8;
binmode STDOUT, ':utf8';
my $utf8 = 'Привет Москва';
my @encodings = qw/UCS-2 UCS-2LE UCS-2BE UTF-16 UTF-32 ISO-8859-5 CP855 CP1251 KOI8-F KOI8-R KOI8-U/;
say '
:: Supported Cyrillic encoding
---------------------------------------------
UTF-8 ', $utf8;
for (@encodings) {
printf "%-11s %s\n", $_, unpack('H*', encode($_,$utf8));
}
输出
:: Supported Cyrillic encoding
---------------------------------------------
UTF-8 Привет Москва
UCS-2 041f044004380432043504420020041c043e0441043a04320430
UCS-2LE 1f044004380432043504420420001c043e0441043a0432043004
UCS-2BE 041f044004380432043504420020041c043e0441043a04320430
UTF-16 feff041f044004380432043504420020041c043e0441043a04320430
UTF-32 0000feff0000041f0000044000000438000004320000043500000442000000200000041c0000043e000004410000043a0000043200000430
ISO-8859-5 bfe0d8d2d5e220bcdee1dad2d0
CP855 dde1b7eba8e520d3d6e3c6eba0
CP1251 cff0e8e2e5f220cceef1eae2e0
KOI8-F f0d2c9d7c5d420edcfd3cbd7c1
KOI8-R f0d2c9d7c5d420edcfd3cbd7c1
KOI8-U f0d2c9d7c5d420edcfd3cbd7c1
两个都是很好的答案。下面是 Polar Bear 代码的轻微扩展,用于打印有关字符串的详细信息:
use strict;
use warnings;
use feature 'say';
use utf8;
use Encode;
sub about {
my($str) = @_;
# https://perldoc.perl.org/bytes.html
my $charlen = length($str);
my $txt;
{
use bytes;
my $mark = (utf8::is_utf8($str) ? "yes" : "no");
my $bytelen = length($str);
$txt = sprintf("Length: %d byte, %d chars, utf-8: %s, contents: %vd\n",
$bytelen,$charlen,$mark,$str);
}
return $txt;
}
my $str;
my $utf8 = 'Привет Москва';
my $ucs2le = '1f044004380432043504420420001c043e0441043a0432043004'; # Little Endian
my $ucs2be = '041f044004380432043504420020041c043e0441043a04320430'; # Big Endian
my $utf16 = '041f044004380432043504420020041c043e0441043a04320430';
my $utf32 = '0000041f0000044000000438000004320000043500000442000000200000041c0000043e000004410000043a0000043200000430';
binmode STDOUT, ':utf8';
say 'UTF-8: ' . $utf8;
say about($utf8);
{
my $str = pack('H*',$ucs2be);
say 'UCS-2BE: ' . decode('UCS-2BE',$str);
say about($str);
}
{
my $str = pack('H*',$ucs2le);
say 'UCS-2LE: ' . decode('UCS-2LE',$str);
say about($str);
}
{
my $str = pack('H*',$utf16);
say 'UTF-16: '. decode('UTF16',$str);
say about($str);
}
{
my $str = pack('H*',$utf32);
say 'UTF-32: ' . decode('UTF32',$str);
say about($str);
}
# Try identity transcoding
{
my $str_encoded_in_utf16 = encode('UTF16',$utf8);
my $str = decode('UTF16',$str_encoded_in_utf16);
say 'The same: ' . $str;
say about($str);
}
运行 这给出:
UTF-8: Привет Москва
Length: 25 byte, 13 chars, utf-8: yes, contents: 208.159.209.128.208.184.208.178.208.181.209.130.32.208.156.208.190.209.129.208.186.208.178.208.176
UCS-2BE: Привет Москва
Length: 26 byte, 26 chars, utf-8: no, contents: 4.31.4.64.4.56.4.50.4.53.4.66.0.32.4.28.4.62.4.65.4.58.4.50.4.48
UCS-2LE: Привет Москва
Length: 26 byte, 26 chars, utf-8: no, contents: 31.4.64.4.56.4.50.4.53.4.66.4.32.0.28.4.62.4.65.4.58.4.50.4.48.4
UTF-16: Привет Москва
Length: 26 byte, 26 chars, utf-8: no, contents: 4.31.4.64.4.56.4.50.4.53.4.66.0.32.4.28.4.62.4.65.4.58.4.50.4.48
UTF-32: Привет Москва
Length: 52 byte, 52 chars, utf-8: no, contents: 0.0.4.31.0.0.4.64.0.0.4.56.0.0.4.50.0.0.4.53.0.0.4.66.0.0.0.32.0.0.4.28.0.0.4.62.0.0.4.65.0.0.4.58.0.0.4.50.0.0.4.48
The same: Привет Москва
Length: 25 byte, 13 chars, utf-8: yes, contents: 208.159.209.128.208.184.208.178.208.181.209.130.32.208.156.208.190.209.129.208.186.208.178.208.176
我做了一个小图作为下次的概述,涵盖 encode
、decode
和 pack
。因为最好为下一次做好准备。
(上图及其 graphml
文件可用 here)