How do we separate values fetched from findvalue of XML::LibXML by a delimiter?
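I have an XML file that I need to parse. Although I am able to fetch the values, I cannot separate them with a delimiter for further processing. Please advise. My code is below: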
use XML::LibXML;
my $filename = 'Test.xml';
my $parser = XML::LibXML->new();
my $dom = $parser->parse_file($filename);
my $root = $dom->documentElement();
my $xpc = XML::LibXML::XPathContext->new($root);
foreach my $id ($xpc->findnodes('/dataset/chapter'))
{
print $xpc->findvalue('mono/route-list', $id);
print join ",", $xpc->findvalue('mono/route-list', $id);
}
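I get the same result for both print statements, whereas the expected results are:
ophthalmicoraltopicalnasalinjectionoraloraloral
ophthalmic,oral,topical,nasal,injection,oral,oral,oral,oral
The XML file is structured as follows: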
<dataset id="5"><title>NDC 11</title>
<chapter id="9"><title>NDC 11</title>
<mono id="310694" mid="145787">
<nam>00173074200</nam>
<route-list>
<list-set-field dbId="25413">
<name>ophthalmic</name>
</list-set-field>
</route-list>
</mono>
<mono id="4128683" mid="536890">
<nam>51079020406</nam>
<route-list>
<list-set-field dbId="25413">
<name>oral</name>
</list-set-field>
</route-list>
</mono>
<mono id="4128743" mid="536930">
<nam>65862007360</nam>
<route-list>
<list-set-field dbId="25413">
<name>topical</name>
</list-set-field></route-list>
</mono>
<mono id="3419599" mid="469070">
<nam>49702021718</nam>
<route-list>
<list-set-field dbId="25413">
<name>nasal</name>
</list-set-field>
</route-list>
</mono>
<mono id="2990346" mid="440470">
<nam>49702022118</nam>
<route-list>
<list-set-field dbId="25413">
<name>injection</name>
</list-set-field>
</route-list>
</mono>
<mono id="2990347" mid="440470">
<nam>49702022144</nam>
<route-list>
<list-set-field dbId="25413">
<name>oral</name>
</list-set-field>
</route-list>
</mono>
<mono id="2990357" mid="440491">
<nam>49702022248</nam>
<route-list>
<list-set-field dbId="25413">
<name>oral</name>
</list-set-field>
</route-list>
</mono>
<mono id="3808911" mid="513570">
<nam>00378410591</nam>
<route-list>
<list-set-field dbId="25413">
<name>oral</name>
</list-set-field>
</route-list>
</mono>
<mono id="4128724" mid="536910">
<nam>60505358306</nam>
<route-list>
<list-set-field dbId="25413">
<name>oral</name>
</list-set-field>
</route-list>
</mono>
</chapter>
</dataset>
If you try this code (note the last line in the for loop):
use strict;
use warnings;
use 5.016;
use XML::LibXML;
my $filename = 'Test.xml';
my $dom = XML::LibXML->load_xml(
location => $filename,
);
my $xpc = XML::LibXML::XPathContext->new($dom);
CHAPTER:
for my $chapter ($xpc->findnodes('/dataset/chapter')) {
my $string = $xpc->findvalue('mono/route-list', $chapter);
print $string;
last CHAPTER; #<*****NOTE THIS
}
you will get the output:
ophthalmic
oral
topical
nasal
injection
oral
oral
oral
oral
The docs say:
findvalue()
...returns the literal value of the results.
"Results" means more than one result, and one result is all of the text between one matching pair of tags.
Each line of the XML ends with a hidden character, a newline:
<route-list>\n
<list-set-field dbId="25413">\n
<name>ophthalmic</name>\n
</list-set-field>\n
</route-list>\n
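...as well as a few spaces/tabs at the start of each line. The spaces/tabs and newlines count as text, and they sit between the <route-list> tags. As a result, the text of one result also includes all of those spaces/tabs/newlines.
On top of that, findvalue() returns the text of all the results as a single string, which is why join has nothing to join. You could split that string with a regex to get the individual values, but rather than create more work for yourself, do this instead: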
CHAPTER:
for my $chapter ($xpc->findnodes('/dataset/chapter')) {
for my $name ($xpc->findnodes('mono/route-list//name', $chapter)) { # relative path: a leading // would search the whole document, not just $chapter
say $name->textContent;
last CHAPTER;
}
}
--output:--
ophthalmic
...or even this:
CHAPTER:
for my $chapter ($xpc->findnodes('/dataset/chapter')) {
for my $name_text ($xpc->findnodes('mono/route-list//name/text()', $chapter)) { # relative path, evaluated against $chapter
say $name_text;
last CHAPTER;
}
}
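To get the comma-separated string the question asks for, one option is to collect the text of every name node into a list and join that list. The following is only a minimal sketch: it assumes the XML from the question, trimmed to two mono entries and inlined as a string so the snippet runs on its own, and the variable names @routes and @lines are illustrative.

```perl
use strict;
use warnings;
use XML::LibXML;

# Assumption: a trimmed copy of the question's XML, inlined so that
# the snippet does not depend on Test.xml being present.
my $xml = <<'XML';
<dataset id="5"><title>NDC 11</title>
<chapter id="9"><title>NDC 11</title>
<mono id="310694" mid="145787">
<nam>00173074200</nam>
<route-list>
<list-set-field dbId="25413">
<name>ophthalmic</name>
</list-set-field>
</route-list>
</mono>
<mono id="4128683" mid="536890">
<nam>51079020406</nam>
<route-list>
<list-set-field dbId="25413">
<name>oral</name>
</list-set-field>
</route-list>
</mono>
</chapter>
</dataset>
XML

my $dom = XML::LibXML->load_xml(string => $xml);
my $xpc = XML::LibXML::XPathContext->new($dom);

my @lines;    # one comma-separated line per chapter
for my $chapter ($xpc->findnodes('/dataset/chapter')) {
    # findnodes() returns one node per <name>, so each value stays separate
    my @routes = map { $_->textContent }
                 $xpc->findnodes('mono/route-list//name', $chapter);
    push @lines, join(',', @routes);
}

print "$_\n" for @lines;    # prints: ophthalmic,oral
```

Because the path selects the name elements themselves rather than the whole route-list, the indentation and newlines around them never end up in the values, so no trimming is needed before the join.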