Parsing XML with XQuery when the XML tree contains an HTML table
I have the following XML, which comes from a URL.
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:media="http://search.yahoo.com/mrss/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
<channel>
<title>Videos</title>
<link>https://www.example.com/r/videos/</link>
<description>A long description of the video.</description>
<image>...</image>
<atom:link rel="self" href="http://www.example.com/videos/.xml" type="application/rss+xml"/>
<item>
<title>The most used Jazz lick in history.</title>
<link>
http://www.example.com/
</link>
<guid isPermaLink="true">
http://www.example.com/
</guid>
<pubDate>Mon, 07 Sep 2015 14:43:34 +0000</pubDate>
<description>
<table>
<tr>
<td>
<a href="http://www.example.com/">
<img src="http://www.example.com/.jpg" alt="The most used Jazz lick in history." title="The most used Jazz lick in history." />
</a>
</td>
<td> submitted by
<a href="http://www.example.com/"> jcepiano </a>
<br/>
<a href="http://www.youtube.com/">[link]</a>
<a href="http://www.example.com/">
[508 comments]
</a>
</td>
</tr>
</table>
</description>
<media:title>The most used Jazz lick in history.</media:title>
<media:thumbnail url="http://example.jpg"/>
</item>
<item>
<title>The most used Jazz lick in history.</title>
<link>
http://www.example.com/
</link>
<guid isPermaLink="true">
http://www.example.com/
</guid>
<pubDate>Mon, 07 Sep 2015 14:43:34 +0000</pubDate>
<description>
<table>
<tr>
<td>
<a href="http://www.example.com/">
<img src="http://www.example.com/.jpg" alt="The most used Jazz lick in history." title="The most used Jazz lick in history." />
</a>
</td>
<td> submitted by
<a href="http://www.example.com/"> jcepiano </a>
<br/>
<a href="http://www.youtube.com/">[link]</a>
<a href="http://www.example.com/">
[508 comments]
</a>
</td>
</tr>
</table>
</description>
<media:title>The most used Jazz lick in history.</media:title>
<media:thumbnail url="http://example.jpg"/>
</item>
<item>
<title>The most used Jazz lick in history.</title>
<link>
http://www.example.com/
</link>
<guid isPermaLink="true">
http://www.example.com/
</guid>
<pubDate>Mon, 07 Sep 2015 14:43:34 +0000</pubDate>
<description>
<table>
<tr>
<td>
<a href="http://www.example.com/">
<img src="http://www.example.com/.jpg" alt="The most used Jazz lick in history." title="The most used Jazz lick in history." />
</a>
</td>
<td> submitted by
<a href="http://www.example.com/"> jcepiano </a>
<br/>
<a href="http://www.youtube.com/">[link]</a>
<a href="http://www.example.com/">
[508 comments]
</a>
</td>
</tr>
</table>
</description>
<media:title>The most used Jazz lick in history.</media:title>
<media:thumbnail url="http://example.jpg"/>
</item>
<item>
<title>The most used Jazz lick in history.</title>
<link>
http://www.example.com/
</link>
<guid isPermaLink="true">
http://www.example.com/
</guid>
<pubDate>Mon, 07 Sep 2015 14:43:34 +0000</pubDate>
<description>
<table>
<tr>
<td>
<a href="http://www.example.com/">
<img src="http://www.example.com/.jpg" alt="The most used Jazz lick in history." title="The most used Jazz lick in history." />
</a>
</td>
<td> submitted by
<a href="http://www.example.com/"> jcepiano </a>
<br/>
<a href="http://www.youtube.com/">[link]</a>
<a href="http://www.example.com/">
[508 comments]
</a>
</td>
</tr>
</table>
</description>
<media:title>The most used Jazz lick in history.</media:title>
<media:thumbnail url="http://example.jpg"/>
</item>
</channel>
</rss>
For each item, I want to echo the node value of its title, together with the href of the a whose nodeValue is "[link]" under the same item.
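Can you write complete code that does this? I will then study what each piece of the code does.
[I am looking for code that is good from a performance point of view.]
I tried to do it with DomDocument, loadXML and loadHTML, but without success.
Here is my code: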
$url = "https://www.example.com/r/videos/.xml";
$dom = new domDocument;
$dom->load($url);
$dom->preserveWhiteSpace = false;
$items = $dom->getElementsByTagName('item');
$descs = $dom->getElementsByTagName('description');
foreach($items as $item){
$title = $item->getElementsByTagName('title')->item(0)->nodeValue;
echo $title . "<br>"; //This is echoing well
foreach($item->getElementsByTagName('description') as $desc){
$domH = new domDocument();
$domH->loadHTML((string)$desc); // here I get the error, mentioned below
$td = $domH->getElementsByTagName('td')->item(1);
$anchors = $td->getElementsByTagName('a')->item(1);
echo $anchors->item(0)->getAttribute('href');
}
}
I get this error:
Catchable fatal error: Object of class DOMElement could not be converted to string in /home/thanksbelieve/public_html/vsi/trend_vids.php on line 16
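I think I need a way to convert the object to a string, and then it should work. I also tried calling saveHTML() inside the second foreach loop, before the call to loadHTML((string)$desc), but had no luck.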
I have not found a simple, easy-to-follow tutorial online.
Any help would be appreciated.
Thanks :)
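On the "convert the object to a string" point: a DOMElement cannot be cast with (string), but it can be serialized. Below is a minimal sketch, assuming a hypothetical one-item feed held in a string rather than the real URL. It contrasts saveXML() with a node argument, which returns the node's markup, and nodeValue, which returns only the text content. If the live feed delivers the table as escaped text or CDATA (common for RSS, but an assumption here), nodeValue would already hold the HTML string.

```php
<?php
// Hypothetical one-item feed standing in for the real URL.
$xml = '<item><title>T</title><description><table><tr><td>'
     . '<a href="http://www.youtube.com/">[link]</a>'
     . '</td></tr></table></description></item>';

$dom = new DOMDocument();
$dom->loadXML($xml);
$desc = $dom->getElementsByTagName('description')->item(0);

// (string)$desc fails: DOMElement has no string conversion.
// saveXML() with a node argument serializes just that node, tags included.
$markup = $dom->saveXML($desc);

// nodeValue is the concatenated text content, with all tags dropped.
$text = $desc->nodeValue;

echo $markup, "\n";
echo $text, "\n";
```

Which of the two is appropriate depends on whether the description holds real child elements (serialize with saveXML) or an escaped HTML string (read nodeValue).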
I was finally able to get it working with the code below:
<?php
$url = "https://www.example.com/r/videos/.xml";

// Load the RSS feed. preserveWhiteSpace only takes effect when set before load().
$feed_dom = new DOMDocument;
$feed_dom->preserveWhiteSpace = false;
$feed_dom->load($url);

$items = $feed_dom->getElementsByTagName('item');
foreach ($items as $item) {
    $title = $item->getElementsByTagName('title')->item(0)->nodeValue;
    // The description's text content is used as an HTML string.
    $desc_table = $item->getElementsByTagName('description')->item(0)->nodeValue;
    echo $title . "<br>";

    // Parse the description as its own HTML document.
    $table_dom = new DOMDocument;
    $table_dom->loadHTML($desc_table);
    $xpath = new DOMXPath($table_dom);

    // The second <a> in the second <td> is the "[link]" anchor.
    $yt_link_node = $xpath->query("//table/tr/td[2]/a[2]");
    foreach ($yt_link_node as $yt_link) {
        $yt = $yt_link->getAttribute('href');
        echo $yt . "<br>";
        echo "<br>";
    }
}
?>
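The query above relies on position (second a in the second td), so it would break if the table layout changed. As a possible variation, here is a sketch that picks the anchor by its text, "[link]", which is closer to what the question asked for. The inline feed, the function name extractLinks, and the escaped-HTML description are all assumptions made for illustration; they are not taken from the real feed.

```php
<?php
// Hypothetical inline feed in place of the real URL. The description holds
// escaped HTML, which is an assumption about how the live feed delivers it.
$rss = <<<'XML'
<rss version="2.0"><channel>
<item>
<title>The most used Jazz lick in history.</title>
<description>&lt;table&gt;&lt;tr&gt;&lt;td&gt; submitted by &lt;a href="http://www.example.com/"&gt; jcepiano &lt;/a&gt; &lt;a href="http://www.youtube.com/"&gt;[link]&lt;/a&gt; &lt;a href="http://www.example.com/"&gt;[508 comments]&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;</description>
</item>
</channel></rss>
XML;

// Hypothetical helper: returns one ['title' => ..., 'href' => ...] per item.
function extractLinks(string $rss): array
{
    libxml_use_internal_errors(true); // keep loose HTML from emitting warnings

    $feed = new DOMDocument();
    $feed->loadXML($rss);
    $feedXpath = new DOMXPath($feed);

    $results = [];
    foreach ($feedXpath->query('//item') as $item) {
        // string(...) yields the text of the first matching child of this item.
        $title = trim($feedXpath->evaluate('string(title)', $item));
        $html  = $feedXpath->evaluate('string(description)', $item);
        if (trim($html) === '') {
            continue; // nothing to parse for this item
        }

        $table = new DOMDocument();
        $table->loadHTML($html);
        $tableXpath = new DOMXPath($table);

        // Match the anchor by its trimmed text instead of by position.
        $href = $tableXpath->evaluate(
            'string(//a[normalize-space(.) = "[link]"]/@href)'
        );

        $results[] = ['title' => $title, 'href' => $href];
    }
    return $results;
}

foreach (extractLinks($rss) as $row) {
    echo $row['title'], "<br>", $row['href'], "<br><br>";
}
```

Matching on the text costs a little more than a positional step, but it keeps working if the feed adds or reorders anchors inside the cell.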
Thanks to Abel, your comments were very helpful in getting the code working! :)