Perl Text::Iconv: unsupported conversion
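In Perl, I read HTML pages and convert them to UTF-8 with Text::Iconv. But when a page declares a bogus charset, for example charset="blabla", the Perl program dies and prints "unsupported conversion".
I tried setting Text::Iconv->raise_error to 0 or 1, but without success; the program always dies.
How can I keep the program from crashing?
Or how can I check which charsets are supported before converting?
(I know I can get the list from the OS with "iconv --list", but there must be a better solution, I hope.)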
How to avoid program crash?
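Use eval to catch the error. As far as I can tell, raise_error only governs errors raised by convert(); new() dies on an unsupported conversion regardless of that setting, so the constructor call itself has to be wrapped: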
use strict;
use warnings;
use 5.016;
use Text::Iconv;

my $source_encoding = 'blabla';
my $result_encoding = 'utf-8';

# new() dies on an unsupported conversion; eval traps the exception.
my $converter = eval {
    Text::Iconv->new(
        $source_encoding,
        $result_encoding
    );
}; #Error message gets inserted into $@

if (not $converter and $@ =~ /invalid argument/i) {
    say "Either the '$source_encoding' encoding or the ",
        "'$result_encoding' encoding\nis not available on this system.";
}

if ($converter) { #Can new() fail in other ways?
    my $result = $converter->convert('€');
    # convert() returns undef on failure; testing definedness keeps
    # valid results such as '' or '0' from being reported as errors.
    if (not defined $result) {
        say "Some characters in '$source_encoding'\n",
            "are invalid in '$result_encoding'.";
    }
    else {
        say $result;
    }
}
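In the eval BLOCK form, the code within the BLOCK is parsed only once (at the same time the code surrounding the eval itself was parsed) and executed within the context of the current Perl program. This form is typically used to trap exceptions more efficiently than the first form, eval EXPR (see below), while also providing the benefit of checking the code within BLOCK at compile time.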
http://perldoc.perl.org/functions/eval.html
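The same pattern can be folded into a small helper so the calling code never sees the exception. This is only a sketch: make_converter and try_make_converter are hypothetical names, and make_converter is a stand-in that dies the way Text::Iconv->new does for an unknown encoding, so the example runs without the module installed.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a constructor that dies on bad input,
# as Text::Iconv->new does for an unknown encoding.
sub make_converter {
    my ($from, $to) = @_;
    die "Unsupported conversion from $from to $to\n" if $from eq 'blabla';
    return { from => $from, to => $to };
}

# Returns (converter, undef) on success, (undef, error message) on failure.
sub try_make_converter {
    my ($from, $to) = @_;
    my $converter;
    # The trailing 1 makes eval return true on success, so the check
    # does not depend on the contents of $@ alone.
    my $ok = eval { $converter = make_converter($from, $to); 1 };
    return ($converter, undef) if $ok;
    my $error = $@ || 'unknown error';
    chomp $error;
    return (undef, $error);
}

my ($conv, $err) = try_make_converter('blabla', 'utf-8');
print defined $conv ? "converter ready\n" : "skipping page: $err\n";
```

In a page-processing loop, a failed call would then just skip (or fall back to a default charset for) the offending page instead of ending the program.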
Or how to check the supported charsets before conversion? (I know I can read them from the OS with "iconv --list", but there must be a better solution, I hope.)
What's wrong with iconv --list?
use strict;
use warnings;
use 5.016;
use Text::Iconv;

my $source_encoding = 'blabla';
my $result_encoding = 'utf-8';

my $available_encodings = `iconv --list`; #Backticks return a string.

# Build a lookup set of lowercased names. Depending on the iconv build,
# names may carry a trailing "//" or ",", so strip those first.
my %encodings_set;
for my $name (split /\s+/, $available_encodings) {
    $name =~ s{[,/]+\z}{};
    $encodings_set{lc $name} = undef if length $name;
}

# The keys are lowercased, so the lookups must be lowercased too.
my $source_encoding_available = exists $encodings_set{lc $source_encoding};
my $result_encoding_available = exists $encodings_set{lc $result_encoding};

if ($source_encoding_available
        and $result_encoding_available) {
    say "Ready to convert";
}
else {
    if (not $source_encoding_available) {
        say "'$source_encoding' encoding not available.";
    }
    if (not $result_encoding_available) {
        say "'$result_encoding' encoding not available.";
    }
}
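If shelling out to iconv feels clumsy, the eval approach can double as the check itself: try to build the converter and see whether that succeeds. A minimal sketch, assuming Text::Iconv is the module doing the real conversion; conversion_supported is a hypothetical helper name, not part of Text::Iconv.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if a converter can be built, 0 if the
# constructor fails, and undef if Text::Iconv itself cannot be loaded.
sub conversion_supported {
    my ($from, $to) = @_;
    return unless eval { require Text::Iconv; 1 };
    my $converter = eval { Text::Iconv->new($from, $to) };
    return defined $converter ? 1 : 0;
}

for my $charset ('iso-8859-1', 'blabla') {
    my $ok = conversion_supported($charset, 'utf-8');
    printf "%-12s %s\n", $charset,
        !defined $ok ? 'Text::Iconv not installed'
      : $ok          ? 'supported'
      :                'not supported';
}
```

This asks the same iconv library that will perform the conversion, so there is no list output to parse and no dependence on how a particular iconv command formats it.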