Parsing XML with XQuery when it contains an HTML table inside the XML tree

I have the following XML, which comes from a URL.

<rss xmlns:dc="http://purl.org/dc/elements/1.1/"  xmlns:media="http://search.yahoo.com/mrss/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
<channel>
<title>Videos</title>
<link>https://www.example.com/r/videos/</link>
<description>A long description of the video.</description>
<image>...</image>
<atom:link rel="self" href="http://www.example.com/videos/.xml" type="application/rss+xml"/>
<item>
    <title>The most used Jazz lick in history.</title>
    <link>
    http://www.example.com/
    </link>
    <guid isPermaLink="true">
     http://www.example.com/
    </guid>
<pubDate>Mon, 07 Sep 2015 14:43:34 +0000</pubDate>
<description>
<table>
    <tr>
        <td>
            <a href="http://www.example.com/">
                <img src="http://www.example.com/.jpg" alt="The most used Jazz lick in history." title="The most used Jazz lick in history." />
            </a>
        </td>
        <td> submitted by 
            <a href="http://www.example.com/"> jcepiano </a>
            <br/>
            <a href="http://www.youtube.com/">[link]</a>
            <a href="http://www.example.com/">
                [508 comments]
            </a>
        </td>
    </tr>
</table>
</description>
<media:title>The most used Jazz lick in history.</media:title>
<media:thumbnail url="http://example.jpg"/>
</item>
<item>
    <title>The most used Jazz lick in history.</title>
    <link>
    http://www.example.com/
    </link>
    <guid isPermaLink="true">
     http://www.example.com/
    </guid>
<pubDate>Mon, 07 Sep 2015 14:43:34 +0000</pubDate>
<description>
<table>
    <tr>
        <td>
            <a href="http://www.example.com/">
                <img src="http://www.example.com/.jpg" alt="The most used Jazz lick in history." title="The most used Jazz lick in history." />
            </a>
        </td>
        <td> submitted by 
            <a href="http://www.example.com/"> jcepiano </a>
            <br/>
            <a href="http://www.youtube.com/">[link]</a>
            <a href="http://www.example.com/">
                [508 comments]
            </a>
        </td>
    </tr>
</table>
</description>
<media:title>The most used Jazz lick in history.</media:title>
<media:thumbnail url="http://example.jpg"/>
</item>
<item>
    <title>The most used Jazz lick in history.</title>
    <link>
    http://www.example.com/
    </link>
    <guid isPermaLink="true">
     http://www.example.com/
    </guid>
<pubDate>Mon, 07 Sep 2015 14:43:34 +0000</pubDate>
<description>
<table>
    <tr>
        <td>
            <a href="http://www.example.com/">
                <img src="http://www.example.com/.jpg" alt="The most used Jazz lick in history." title="The most used Jazz lick in history." />
            </a>
        </td>
        <td> submitted by 
            <a href="http://www.example.com/"> jcepiano </a>
            <br/>
            <a href="http://www.youtube.com/">[link]</a>
            <a href="http://www.example.com/">
                [508 comments]
            </a>
        </td>
    </tr>
</table>
</description>
<media:title>The most used Jazz lick in history.</media:title>
<media:thumbnail url="http://example.jpg"/>
</item>
<item>
    <title>The most used Jazz lick in history.</title>
    <link>
    http://www.example.com/
    </link>
    <guid isPermaLink="true">
     http://www.example.com/
    </guid>
<pubDate>Mon, 07 Sep 2015 14:43:34 +0000</pubDate>
<description>
<table>
    <tr>
        <td>
            <a href="http://www.example.com/">
                <img src="http://www.example.com/.jpg" alt="The most used Jazz lick in history." title="The most used Jazz lick in history." />
            </a>
        </td>
        <td> submitted by 
            <a href="http://www.example.com/"> jcepiano </a>
            <br/>
            <a href="http://www.youtube.com/">[link]</a>
            <a href="http://www.example.com/">
                [508 comments]
            </a>
        </td>
    </tr>
</table>
</description>
<media:title>The most used Jazz lick in history.</media:title>
<media:thumbnail url="http://example.jpg"/>
</item>
</channel>
</rss>

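For each item, I want to echo the node value of its title together with the href of the a element whose nodeValue is "[link]" inside the description of that same item.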

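Can you write the complete code to achieve this? I will then study what each part of the code does.

[I am looking for code that performs well.]

I tried doing this with DOMDocument, loadXML and loadHTML, but without success.

Here is my code: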

$url = "https://www.example.com/r/videos/.xml";
$dom = new domDocument; 
$dom->load($url); 
$dom->preserveWhiteSpace = false;
$items = $dom->getElementsByTagName('item');
$descs = $dom->getElementsByTagName('description');
foreach($items as $item){
    $title = $item->getElementsByTagName('title')->item(0)->nodeValue;
    echo $title . "<br>"; //This is echoing well

    foreach($item->getElementsByTagName('description') as $desc){

            $domH = new domDocument();
            $domH->loadHTML((string)$desc)); // here I get the error, mentioned below

            $td = $domH->getElementsByTagName('td')->item(1);
            $anchors = $td->getElementsByTagName('a')->item(1);

            echo $anchors->item(0)->getAttribute('href');
        }
}

I get the error: Catchable fatal error: Object of class DOMElement could not be converted to string in /home/thanksbelieve/public_html/vsi/trend_vids.php on line 16

I think I need a way to convert the object to a string, and then it should work. I also tried calling saveHTML() inside the second foreach loop before loadHTML((string)$desc)), but had no luck.
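
The object-to-string conversion mentioned above can be sketched roughly as follows. This is only a minimal illustration, assuming the table markup is present as real child elements of description (as in the sample shown); the inline $xml string is a made-up stand-in for a single item. DOMDocument::saveXML() accepts a node argument and returns that node serialized as a string, which is what the (string) cast cannot do.

```php
<?php
// Hypothetical stand-in for one <item> of the feed (not the real feed).
$xml = '<item><description><table><tr><td>'
     . '<a href="http://www.youtube.com/">[link]</a>'
     . '</td></tr></table></description></item>';

$dom = new DOMDocument();
$dom->loadXML($xml);

$desc = $dom->getElementsByTagName('description')->item(0);

// A DOMElement cannot be cast with (string); ask the owning document
// to serialize the node instead.
$markup = $dom->saveXML($desc);

// Inner markup only: serialize each child node and concatenate.
$inner = '';
foreach ($desc->childNodes as $child) {
    $inner .= $dom->saveXML($child);
}

echo $inner, "\n";
```

When the table is already part of the parsed tree like this, re-parsing the string with loadHTML() is optional, since the elements can also be queried directly.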

I haven't found a simple, easy-to-follow tutorial online. Any help would be greatly appreciated.

Thanks :)

I finally got it working with the code below:

<?php
$url = "https://www.example.com/r/videos/.xml";
$feed_dom = new DOMDocument;
// preserveWhiteSpace only takes effect if it is set before loading
$feed_dom->preserveWhiteSpace = false;
$feed_dom->load($url);
$items = $feed_dom->getElementsByTagName('item');

foreach ($items as $item) {
    // nodeValue is a plain string (the element's text content), so it can be passed to loadHTML()
    $title = $item->getElementsByTagName('title')->item(0)->nodeValue;
    $desc_table = $item->getElementsByTagName('description')->item(0)->nodeValue;
    echo $title . "<br>";

    // Parse the description's markup as a separate HTML document
    $table_dom = new DOMDocument;
    $table_dom->preserveWhiteSpace = false;
    $table_dom->loadHTML($desc_table);
    $xpath = new DOMXPath($table_dom);
    // Second <a> inside the second <td>: the "[link]" anchor
    $yt_link_node = $xpath->query("//table/tr/td[2]/a[2]");

    foreach ($yt_link_node as $yt_link) {
        $yt = $yt_link->getAttribute('href');
        echo $yt . "<br>";
        echo "<br>";
    }
}
?>
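
The query above picks the anchor by position (a[2]), while the question asked for the anchor whose text is "[link]". A hedged variant that selects by text instead might look like the sketch below. The function name extractLinks and the one-item $sample feed are hypothetical, and the sketch assumes each description carries the table as escaped text, which is the case in which its nodeValue is markup that loadHTML() can parse.

```php
<?php
/**
 * Returns title/href pairs, one per <item> that has a "[link]" anchor.
 * Hypothetical helper; assumes <description> holds escaped HTML.
 */
function extractLinks(string $feedXml): array
{
    $feed = new DOMDocument();
    $feed->loadXML($feedXml);
    $feedXpath = new DOMXPath($feed);

    // Collect HTML parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $results = [];

    foreach ($feedXpath->query('//item') as $item) {
        $title = trim($feedXpath->evaluate('string(title)', $item));
        $html  = $feedXpath->evaluate('string(description)', $item);
        if (trim($html) === '') {
            continue; // nothing to parse for this item
        }

        $table = new DOMDocument();
        $table->loadHTML($html);
        $tableXpath = new DOMXPath($table);

        // Select the anchor by its text rather than by its position.
        $href = $tableXpath->evaluate(
            'string(//a[normalize-space(.) = "[link]"]/@href)'
        );
        if ($href !== '') {
            $results[] = ['title' => $title, 'href' => $href];
        }
    }

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $results;
}

// Hypothetical one-item feed with an escaped description.
$sample = <<<'XML'
<rss version="2.0"><channel><item>
<title>The most used Jazz lick in history.</title>
<description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;a href="http://www.example.com/"&gt;jcepiano&lt;/a&gt; &lt;a href="http://www.youtube.com/"&gt;[link]&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;</description>
</item></channel></rss>
XML;

foreach (extractLinks($sample) as $row) {
    echo $row['title'], ' => ', $row['href'], "\n";
}
```

Matching on the text keeps working if the number or order of anchors in the cell changes, at the cost of depending on the exact label "[link]".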

Thanks Abel, your comments were very helpful in getting the code to work! :)
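
As a side note on the performance concern: if the table really is part of the XML tree, as in the sample shown in the question, a second parse with loadHTML() may not be needed at all. The sketch below runs XPath directly on the feed document. The trimmed-down $xml string is a hypothetical stand-in for the feed, and the approach only applies when the description contains real elements rather than escaped text.

```php
<?php
// Hypothetical trimmed-down feed with the table inline in <description>.
$xml = <<<'XML'
<rss xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
<channel>
<item>
    <title>The most used Jazz lick in history.</title>
    <description>
    <table><tr>
        <td> submitted by
            <a href="http://www.example.com/"> jcepiano </a>
            <a href="http://www.youtube.com/">[link]</a>
        </td>
    </tr></table>
    </description>
    <media:title>The most used Jazz lick in history.</media:title>
</item>
</channel>
</rss>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

$pairs = [];
foreach ($xpath->query('/rss/channel/item') as $item) {
    // Both expressions are evaluated relative to the current <item>.
    $title = $xpath->evaluate('normalize-space(title)', $item);
    $href  = $xpath->evaluate(
        'string(description//a[normalize-space(.) = "[link]"]/@href)',
        $item
    );
    $pairs[] = [$title, $href];
}

foreach ($pairs as [$title, $href]) {
    echo $title, ' => ', $href, "\n";
}
```

The unprefixed title step matches only the plain title element, not media:title, because the latter lives in the media namespace.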