PHP DOMDocument: How to parse custom XML/RSS tag names with colons?
I want to parse an RSS feed similar to the following:
<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:x-wr="http://www.w3.org/2002/12/cal/prod/Apple_Comp_628d9d8459c556fa#" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:x-example="http://www.example.com/rss/x-example" xmlns:x-microsoft="http://schemas.microsoft.com/x-microsoft" xmlns:xCal="urn:ietf:params:xml:ns:xcal" version="2.0">
<channel>
<item>
<title>About Apples</title>
<author>David K. Lowie</author>
<description>Some description about apples</description>
<xCal:description>This is the full description about apples</xCal:description>
</item>
<item>
<title>About Oranges</title>
<author>Marry L. Jones</author>
<description>Some description about oranges</description>
<xCal:description>This is the full description about oranges</xCal:description>
</item>
</channel>
</rss>
In PHP, I parse it like this:
$rss = new DOMDocument();
$rss->load( "http://www.example.com/books.rss" );
foreach( $rss->getElementsByTagName("item") as $node ) {
echo $node->getElementsByTagName("title")->item(0)->nodeValue;
echo $node->getElementsByTagName("author")->item(0)->nodeValue;
echo $node->getElementsByTagName("description")->item(0)->nodeValue;
echo $node->getElementsByTagName("xCal:description")->item(0)->nodeValue;
}
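I can read everything except the xCal:description nodes. (The node names are almost the same: description and xCal:description.)
- How can I parse (read) a node like xCal:description?
- Is the problem caused by the similar node names, i.e. description and xCal:description?
(I can't change the RSS source, as it is not under my control.)
Please help.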
$node->getElementsByTagNameNS("urn:ietf:params:xml:ns:xcal", "description")->item(0)->nodeValue
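To show the one-liner above in context, here is a minimal sketch. It assumes a shortened inline copy of the feed loaded with loadXML() instead of the remote URL. The function name fullDescriptions and the constant XCAL_NS are hypothetical names chosen for illustration. The important part is that the lookup uses the namespace URI declared by xmlns:xCal, not the xCal prefix.

```php
<?php
// Hypothetical inline feed standing in for the remote RSS from the question.
$xml = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:xCal="urn:ietf:params:xml:ns:xcal" version="2.0">
  <channel>
    <item>
      <title>About Apples</title>
      <description>Some description about apples</description>
      <xCal:description>This is the full description about apples</xCal:description>
    </item>
  </channel>
</rss>
XML;

// Namespace URI bound to the xCal prefix in the feed (illustrative constant name).
const XCAL_NS = 'urn:ietf:params:xml:ns:xcal';

// Collects the text of the first xCal:description in every <item>.
function fullDescriptions(string $xml): array
{
    $rss = new DOMDocument();
    $rss->loadXML($xml);

    $result = [];
    foreach ($rss->getElementsByTagName('item') as $item) {
        // Match by namespace URI plus local name; the prefix is irrelevant here.
        $first = $item->getElementsByTagNameNS(XCAL_NS, 'description')->item(0);
        // item(0) is null when the element is missing, so guard before reading it.
        $result[] = $first !== null ? $first->nodeValue : null;
    }
    return $result;
}

print_r(fullDescriptions($xml));
```

The null check is there because item(0) returns null for an item without an xCal:description, and reading nodeValue on null would fail.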
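While using the namespace-aware variants of the DOM methods is a correct answer, you might want to take a look at XPath. It is a more comfortable way to get data out of a DOM.
For XPath expressions, you register your own prefixes for the namespaces as needed.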
$rss = new DOMDocument();
$rss->load("http://www.example.com/books.rss");
$xpath = new DOMXPath($rss);
$xpath->registerNamespace('xc', 'urn:ietf:params:xml:ns:xcal');
foreach($xpath->evaluate("//item") as $item) {
echo $xpath->evaluate('string(title)', $item), "\n";
echo $xpath->evaluate('string(author)', $item), "\n";
echo $xpath->evaluate('string(description)', $item), "\n";
echo $xpath->evaluate('string(xc:description)', $item), "\n";
}
Output:
About Apples
David K. Lowie
Some description about apples
This is the full description about apples
About Oranges
Marry L. Jones
Some description about oranges
This is the full description about oranges