Navigating XML to access CDATA using XML::Twig
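I have this XML file and need to access one specific node at a time. Below are a sample of my XML and my sample code.
My code works fine, except that I loop over all Message/Content tags instead of only the Message/Content tags under the current Message tag. For example, while processing the current Message tag (the one with refid="123991123"), I get 3 Message/Content tags back when I only want 1 () to be returned. I hope this makes sense. Any help would be appreciated.
Code: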
my $twig = XML::Twig->new(
    twig_handlers => {
        Selection => sub {
            foreach my $message ($_->findnodes('./Contents/Message')) {
                if($message->att('custom')){
                    $Message_custom = $message->att('custom');
                    foreach my $Content ($_->findnodes('./Contents/Message/Content')) {
                        print $Selection_id.": ".$Message_refid.": ".$TotalContents++."\n";
                        if($Content->att('language') eq "en"){
                            if($Content->att('imagelibraryid')){
                                $Message_Content_language_en_imagelibraryid = $Content->att('imagelibraryid');
                            }else{
                                $Message_Content_language_en = substr($message->field('Content'), 0, 20);
                            }
                        }
                    }
                }
            }
        },
    }
);
XML:
<?xml version="1.0" encoding="UTF-8"?>
<Root>
  <Selection id="54008473">
    <Name>Master</Name>
    <Contents>
      <Message refid="125796458" suppress="true" status="Unchanged"/>
      <Message refid="123991123" suppress="true" status="Unchanged">
        <Content language="en" imagelibraryid="5492396"/>
      </Message>
      <Message refid="128054778" custom="true" status="New">
        <Content language="en"><![CDATA[<p>Some English content</p>]]></Content>
        <Content language="fr"><![CDATA[<p>Some French content</p>]]></Content>
      </Message>
    </Contents>
  </Selection>
  <Selection id="54008475" datavaluerefid="54008479">
    <Name>RMBC</Name>
    <Contents>
      <Message refid="125796458" sameasparent="true" parentrefid="54008473" status="Unchanged"/>
      <Message refid="123991123" sameasparent="true" parentrefid="54008473" status="Unchanged"/>
      <Message refid="128054778" custom="true" status="New">
        <Content language="en"><![CDATA[<p>ada</p>]]></Content>
      </Message>
    </Contents>
  </Selection>
</Root>
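Here is a first attempt at working out what your code is supposed to do, based on the structure of the XML:
- the handler for Selection nodes looks for child Content nodes with the attribute language == 'en' under the Message nodes under the Contents node
  - translated to XPath: ./Contents/Message/Content[@language='en']
- if the Content node has an attribute imagelibraryid, store that value
- otherwise store the CDATA content of its first child
- set refid to the attribute value from the parent Message node
- append the results to the list of contents for the Selection node
- to display the collected contents, use Data::Dumper on the array ref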
#!/usr/bin/perl
use warnings;
use strict;

use XML::Twig;
use Data::Dumper;

# collected English contents, keyed by Selection id
my %selections;

my $twig = XML::Twig->new(
    twig_handlers => {
        Selection => sub {
            #$_->print();
            print "selection id: ", $_->att('id'), "\n";

            my @contents;
            # only the English Content nodes below this Selection
            foreach my $content ($_->findnodes("./Contents/Message/Content[\@language='en']")) {
                my $result = {
                    # refid comes from the parent Message node
                    refid => $content->parent->att('refid'),
                };

                my $id = $content->att('imagelibraryid');
                if (defined $id) {
                    $result->{library} = $id;
                } else {
                    # no image reference: take the CDATA section of the first child
                    $result->{cata} = $content->first_child->cdata;
                }

                push(@contents, $result);
            }

            # store collected Content nodes under selection ID
            $selections{ $_->att('id') } = \@contents;
        },
    }
);

$twig->parse(\*DATA);

while (my($id, $contents) = each %selections) {
    my $dump = Dumper($contents);
    print "Selection '${id}' messages: $dump\n";
}

exit 0;

__DATA__
<?xml version="1.0" encoding="UTF-8"?>
... the rest of your XML left out ...
Test run:
$ perl dummy.pl
selection id: 54008473
selection id: 54008475
Selection '54008473' messages: $VAR1 = [
          {
            'refid' => '123991123',
            'library' => '5492396'
          },
          {
            'cata' => '<p>Some English content</p>',
            'refid' => '128054778'
          }
        ];
Selection '54008475' messages: $VAR1 = [
          {
            'cata' => '<p>ada</p>',
            'refid' => '128054778'
          }
        ];
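
The loop from the question can also be fixed in place. Inside its inner loop, $_ is still the Selection element, so $_->findnodes('./Contents/Message/Content') restarts the search from the Selection and returns every Content below it, whichever Message is being processed. The sketch below searches from $message instead, on the assumption that this is all the question needs. The helper name english_contents and the hash keys are made up for illustration, and text is used instead of field so that the English Content is read rather than simply the first one.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper (not part of the original code): collect the English
# Content of every custom Message, looking only below the current Message.
sub english_contents {
    my ($xml) = @_;
    my @found;
    my $twig = XML::Twig->new(
        twig_handlers => {
            Selection => sub {
                my ($t, $selection) = @_;
                foreach my $message ($selection->findnodes('./Contents/Message')) {
                    next unless $message->att('custom');
                    # children() is relative to $message, not to the Selection
                    foreach my $content ($message->children('Content')) {
                        my $language = $content->att('language');
                        next unless defined $language && $language eq 'en';
                        my $image = $content->att('imagelibraryid');
                        push @found, {
                            selection => $selection->att('id'),
                            refid     => $message->att('refid'),
                            # text() also returns the content of a CDATA section
                            defined($image)
                                ? (imagelibraryid => $image)
                                : (text => substr($content->text, 0, 20)),
                        };
                    }
                }
            },
        },
    );
    $twig->parse($xml);
    return @found;
}

# Sample XML copied from the question
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<Root>
  <Selection id="54008473">
    <Name>Master</Name>
    <Contents>
      <Message refid="125796458" suppress="true" status="Unchanged"/>
      <Message refid="123991123" suppress="true" status="Unchanged">
        <Content language="en" imagelibraryid="5492396"/>
      </Message>
      <Message refid="128054778" custom="true" status="New">
        <Content language="en"><![CDATA[<p>Some English content</p>]]></Content>
        <Content language="fr"><![CDATA[<p>Some French content</p>]]></Content>
      </Message>
    </Contents>
  </Selection>
  <Selection id="54008475" datavaluerefid="54008479">
    <Name>RMBC</Name>
    <Contents>
      <Message refid="125796458" sameasparent="true" parentrefid="54008473" status="Unchanged"/>
      <Message refid="123991123" sameasparent="true" parentrefid="54008473" status="Unchanged"/>
      <Message refid="128054778" custom="true" status="New">
        <Content language="en"><![CDATA[<p>ada</p>]]></Content>
      </Message>
    </Contents>
  </Selection>
</Root>
XML

for my $hit (english_contents($xml)) {
    my $value = defined $hit->{imagelibraryid} ? $hit->{imagelibraryid} : $hit->{text};
    print "$hit->{selection}: $hit->{refid}: $value\n";
}
```

With the sample XML, each custom Message should now report only its own English Content rather than every Content in the Selection. The custom check is kept from the question, so the Message with refid="123991123" is skipped here; drop that line if non-custom Messages should be included too.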