Using Term::ReadLine with Unicode input
I am trying to figure out how to use Term::ReadLine to read Unicode input from the terminal. It turns out that if I enter a Unicode character at the prompt, the returned string varies depending on various settings. (I am running Ubuntu 14.10 and have installed Term::ReadLine::Gnu.) For example (p.pl):
use open qw( :std :utf8 );
use strict;
use warnings;
use Devel::Peek;
use Term::ReadLine;
my $term = Term::ReadLine->new('ProgramName');
$term->ornaments( 0 );
my $ans = $term->readline("Enter message: ");
Dump ( $ans );
Running p.pl and typing å at the prompt gives the output:
Enter message: å
SV = PV(0x83a5a0) at 0x87c080
REFCNT = 1
FLAGS = (PADMY,POK,pPOK)
PV = 0x917500 "\303\245"\0
CUR = 2
LEN = 10
So the returned string $ans does not have the UTF-8 flag set. However, if I run the program with perl -CS p.pl, the output is:
Enter message: å
SV = PVMG(0x24c12e0) at 0x23050a0
REFCNT = 1
FLAGS = (PADMY,POK,pPOK,UTF8)
IV = 0
NV = 0
PV = 0x248faf0 "\303\245"\0 [UTF8 "\x{e5}"]
CUR = 2
LEN = 10
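The UTF-8 flag is now correctly set on $ans. So the first question is: why does the command-line option -CS behave differently from the pragma use open qw( :std :utf8 )?
Next, I tested Term::ReadLine::Stub with the -CS option: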
$ PERL_RL=Stub perl -CS p.pl
The output is now:
Enter message: å
SV = PV(0xf97260) at 0xfd90c8
REFCNT = 1
FLAGS = (PADMY,POK,pPOK,UTF8)
PV = 0x10746e0 "\303\203\302\245"\0 [UTF8 "\x{c3}\x{a5}"]
CUR = 4
LEN = 10
The string $ans has now been double encoded, so the output is corrupted. Is this a bug, or expected behavior?
Term::ReadLine does not read from STDIN; it opens new filehandles. So use open qw(:std :utf8); has no effect on it.
You need to do something like this:
my $term = Term::ReadLine->new('name');
binmode($term->IN, ':utf8');
Update regarding -CS:
The -C option sets bits in the magic variable ${^UNICODE}. The -CS (or -CI) option makes the expression ${^UNICODE} & 0x0001 true. If ${^UNICODE} & 0x0001 is true, Term::ReadLine sets the UTF-8 flag on the input string.
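To make the bit test above concrete, here is a minimal sketch that turns a ${^UNICODE} value into the -C letters listed in perlrun. The helper name describe_unicode_flags is hypothetical and introduced only for illustration; the bit values are the ones perlrun documents for -C, and only a subset of them is shown.

```perl
use strict;
use warnings;

# Bit values of the -C switch as documented in perlrun (subset).
my @UNICODE_BITS = (
    [ 0x0001, 'I (STDIN is assumed to be UTF-8)' ],
    [ 0x0002, 'O (STDOUT will be UTF-8)' ],
    [ 0x0004, 'E (STDERR will be UTF-8)' ],
    [ 0x0008, 'i (UTF-8 is the default layer for input streams)' ],
    [ 0x0010, 'o (UTF-8 is the default layer for output streams)' ],
    [ 0x0020, 'A (@ARGV elements are expected to be UTF-8)' ],
);

# Hypothetical helper: list the flags set in a ${^UNICODE}-style value.
sub describe_unicode_flags {
    my ($value) = @_;
    return map { $_->[1] } grep { $value & $_->[0] } @UNICODE_BITS;
}

# -CS is shorthand for -CIOE, so bit 0x0001 is set when it is used.
print "$_\n" for describe_unicode_flags( ${^UNICODE} );
```

Run it once as perl script.pl and once as perl -CS script.pl to compare which flags are reported.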
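Note that the -CS option is not the same as binmode($term->IN, ':utf8'). The first only sets the UTF-8 flag on the returned string; the second puts a UTF-8 layer on the filehandle, so the input bytes are decoded as they are read.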
As Denis Ibaev points out in his answer, the problem is that Term::ReadLine does not read STDIN; it opens a new input filehandle. As an alternative to calling binmode($term->IN, ':utf8'), it turns out that either the command-line option -CS or use open qw( :std :utf8 ) can be made to work out of the box with Term::ReadLine by supplying STDIN as an argument to Term::ReadLine->new(), as explained in the answer to this question: Term::Readline: encoding-question.
For example:
use strict;
use utf8;
use open qw( :std :utf8 );
use warnings;
use Term::ReadLine;
my $term = Term::ReadLine->new('Test', \*STDIN, \*STDOUT);
my $answer = $term->readline( 'Enter input: ' );