Perl: Open output file in same endianness as input file – UTF-16be vs. UTF-16le
When Perl opens a UTF-16 encoded file,
open my $in, "< :encoding(UTF-16)", "text-utf16le.txt" or die "Error $!\n";
it automatically detects the endianness thanks to the Byte Order Mark (BOM).
But when I open a file for writing with
open my $out, "> :encoding(UTF-16)", "output.txt" or die "Error $!\n";
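Perl opens it as big-endian by default.
How can I tell Perl to open the output file with the same endianness as the input file?
How can I get the endianness/encoding from the input file handle $in? PerlIO::get_layers($in) returns, among other layers, encoding(UTF-16), which does not say which byte order was detected.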
You have to read the BOM yourself.
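use IO::Unread qw( unread );

# $qfn holds the path of the input file.
# Open without a decoding layer so the raw BOM bytes can be inspected.
open(my $fh_in, "<:raw", $qfn)
    or die "Can't open $qfn: $!\n";

# The longest BOM (UTF-32) is 4 bytes.
my $rv = read($fh_in, my $buf, 4);
defined($rv)
    or die "Can't read from $qfn: $!\n";

my $encoding;
my $bom_present;
# The UTF-32 checks come first: the UTF-32le BOM starts with the UTF-16le BOM.
if    ($buf =~ s/^\x00\x00\xFE\xFF//) { $encoding = 'UTF-32be'; $bom_present = 1; }
elsif ($buf =~ s/^\xFF\xFE\x00\x00//) { $encoding = 'UTF-32le'; $bom_present = 1; }
elsif ($buf =~ s/^\xFE\xFF//        ) { $encoding = 'UTF-16be'; $bom_present = 1; }
elsif ($buf =~ s/^\xFF\xFE//        ) { $encoding = 'UTF-16le'; $bom_present = 1; }
elsif ($buf =~ s/^\xEF\xBB\xBF//    ) { $encoding = 'UTF-8';    $bom_present = 1; }
else {
    # No BOM found: assume UTF-8.
    $encoding = 'UTF-8';
    $bom_present = 0;
}

# Push back the bytes read past the BOM, then decode from there.
unread($fh_in, $buf) if length($buf);
binmode($fh_in, ":encoding($encoding)");
binmode($fh_in, ":crlf") if $^O eq 'MSWin32';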
But someone has already done this for you:
use File::BOM qw( open_bom );
my $encoding = open_bom(my $fh_in, $qfn, ':encoding(UTF-8)');
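Once the input's encoding is known, the output file can be opened with that same encoding, which is what the question asks for. The following is only a sketch under a few assumptions: detect_bom and copy_same_encoding are hypothetical helper names, not part of any module; the BOM is skipped with seek instead of IO::Unread so that only core Perl is needed; and the BOM is written by hand, since the endian-specific layers such as encoding(UTF-16LE) do not emit one on output.

```perl
use strict;
use warnings;

# Hypothetical helper: sniff the BOM of $path.
# Returns (encoding name, BOM length in bytes); falls back to UTF-8 without a BOM.
sub detect_bom {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "Can't open $path: $!\n";
    my $rv = read($fh, my $buf, 4);
    defined($rv) or die "Can't read from $path: $!\n";
    close($fh);

    # UTF-32 is checked first because the UTF-32LE BOM begins with the UTF-16LE BOM.
    return ('UTF-32BE', 4) if $buf =~ /^\x00\x00\xFE\xFF/;
    return ('UTF-32LE', 4) if $buf =~ /^\xFF\xFE\x00\x00/;
    return ('UTF-16BE', 2) if $buf =~ /^\xFE\xFF/;
    return ('UTF-16LE', 2) if $buf =~ /^\xFF\xFE/;
    return ('UTF-8',    3) if $buf =~ /^\xEF\xBB\xBF/;
    return ('UTF-8',    0);
}

# Hypothetical helper: copy $in_path to $out_path, keeping the input's
# encoding, byte order and BOM (if it had one). Returns the encoding name.
sub copy_same_encoding {
    my ($in_path, $out_path) = @_;
    my ($encoding, $bom_len) = detect_bom($in_path);

    open(my $in, '<:raw', $in_path) or die "Can't open $in_path: $!\n";
    seek($in, $bom_len, 0) or die "Can't seek in $in_path: $!\n";   # skip the BOM bytes
    binmode($in, ":encoding($encoding)");

    # :raw keeps a platform :crlf layer from being applied to the encoded bytes.
    open(my $out, ">:raw:encoding($encoding)", $out_path)
        or die "Can't open $out_path: $!\n";
    print {$out} "\x{FEFF}" if $bom_len;   # write the BOM ourselves

    while (my $line = <$in>) {
        print {$out} $line;
    }

    close($out) or die "Can't close $out_path: $!\n";
    close($in);
    return $encoding;
}
```

A call such as copy_same_encoding("text-utf16le.txt", "output.txt") would then write output.txt in whatever byte order the input used. Line endings are copied unchanged here because neither handle carries a :crlf layer.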