Keep encoded tag in XML::Twig
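I want to use XML::Twig to modify a large XML file.
When using handler callbacks, XML::Twig seems to change characters that are encoded as HTML entities, for example the greater-than sign (&gt; becomes >).
Sample script: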
use strict;
use warnings;
use XML::Twig;

my $input = q~
<root>
<p>&lt;encoded tag&gt;</p>
</root>
~;
my $t = XML::Twig->new(
    keep_spaces              => 1,
    twig_roots               => { 'p' => \&convert },    # process p elements
    twig_print_outside_roots => 1,                       # print the rest
);
$t->parse($input);

sub convert {
    my ($t, $p) = @_;
    $p->set_att('x' => 'y');    # add an attribute to each p element
    $p->print;                  # print the modified element
}
This turns the document into the following:
<root>
<p x="y">&lt;encoded tag></p>
</root>
I expected to get this:
<root>
<p x="y">&lt;encoded tag&gt;</p>
</root>
How can I keep the encoded content of the tag when using XML::Twig?
You need to set the keep_encoding option in the constructor, as shown below, or change it after constructing the object by calling $twig->set_keep_encoding($option).
Note that the module documentation says this:
This is a (slightly?) evil option: if the XML document is not UTF-8 encoded and you want to keep it that way, then setting keep_encoding will use the "Expat" original_string method for character, thus keeping the original encoding, as well as the original entities in the strings.
But here it is, doing what you ask. Use it at your own risk.
use strict;
use warnings 'all';
use XML::Twig;

my $input = <<END_XML;
<root>
<p>&lt;encoded tag&gt;</p>
</root>
END_XML

my $t = XML::Twig->new(
    keep_spaces              => 1,
    keep_encoding            => 1,                     # keep the original entities
    twig_roots               => { p => \&convert },    # process p elements
    twig_print_outside_roots => 1,                     # print the rest
);
$t->parse($input);

sub convert {
    my ($t, $p) = @_;
    $p->print;
}
Output
<root>
<p>&lt;encoded tag&gt;</p>
</root>
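The script above leaves out the set_att call from the question. As a rough sketch, not checked against every XML::Twig version, the two can be combined as follows. The in-memory filehandle $out and the $buffer variable are additions made here for illustration only, so that the result can be inspected as a string; for a large file you would pass a real output filehandle instead, or go back to twig_print_outside_roots => 1 and a bare $p->print.

```perl
use strict;
use warnings;
use XML::Twig;

my $input = <<'END_XML';
<root>
<p>&lt;encoded tag&gt;</p>
</root>
END_XML

# Assumption for illustration: collect the output in a string
# instead of printing straight to STDOUT.
open my $out, '>', \my $buffer or die "Cannot open in-memory handle: $!";

my $t = XML::Twig->new(
    keep_spaces              => 1,
    keep_encoding            => 1,                     # keep entities as written in the input
    twig_roots               => { p => \&convert },    # process p elements
    twig_print_outside_roots => $out,                  # print the rest to the same handle
);
$t->parse($input);
close $out;

print $buffer;

sub convert {
    my ($t, $p) = @_;
    $p->set_att(x => 'y');    # the modification from the question
    $p->print($out);          # print the element to the same handle
}
```

Going by the behaviour shown in the output above, the p element should come out with both the new attribute and the original &lt; and &gt; entities. The caveat from the documentation still applies: with keep_encoding set, text is passed through as it appeared in the input, so anything you add yourself has to be escaped by hand.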