PHP DOM How to get items and sub-items from UL
I am trying to get all the items and sub-items, together with their anchor tags, from the following menu:
<nav class="header-nav" id="headerLara">
<div class="menu-hauptmenu-container">
<ul id="head_nav_ul" class="menu">
<li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-has-children menu-item-4">
<a>First Menu</a>
<ul class="sub-menu">
<li class="menu-item menu-item-type-post_type menu-item-object-page menu-item-14002">
<a href="http://example.com/fm1">F menu 1</a>
</li>
<li class="menu-item menu-item-type-post_type menu-item-object-post menu-item-12718">
<a href="http://example.com/fm2">F menu 2</a>
</li>
</ul>
</li>
<li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-has-children menu-item-6">
<a>Second Menu</a>
<ul class="sub-menu">
<li class="menu-item menu-item-type-post_type menu-item-object-page menu-item-1257">
<a href="http://example.com/sm1">S menu 1</a>
</li>
<li class="menu-item menu-item-type-post_type menu-item-object-page menu-item-5420">
<a href="http://example.com/sm2">S menu 2</a>
</li>
</ul>
</li>
<li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-12821">
<a href="http://example.com/m3">Third Menu</a>
</li>
</ul>
</div>
</nav>
Now I want output like this:
<nav class="header-nav" id="headerLara">
<div class="menu-hauptmenu-container">
<ul>
<li>
<a class="has-child">First Menu</a>
<ul>
<li>
<a href="http://example.com/fm1">F menu 1</a>
</li>
<li>
<a href="http://example.com/fm2">F menu 2</a>
</li>
</ul>
</li>
<li>
<a class="has-child">Second Menu</a>
<ul>
<li>
<a href="http://example.com/sm1">S menu 1</a>
</li>
<li>
<a href="http://example.com/sm2">S menu 2</a>
</li>
</ul>
</li>
<li>
<a href="http://example.com/m3">Third Menu</a>
</li>
</ul>
</div>
</nav>
I did some research and tried the following PHP code:
<?php
$doc = new DomDocument;
$doc->validateOnParse = true;
$doc->loadHtml(file_get_contents('http://example.com/blabla.php'));
$header = $doc->getElementById('headerLara');
$mainUls = $header->getElementsByTagName('ul');
foreach ($mainUls as $mainUl) {
echo '<ul>';
$mainLis = $mainUl->getElementsByTagName('li');
foreach ($mainLis as $mainLi) {
echo '<li>';
$mainAnc = $mainLi->getElementsByTagName('a');
$href = $mainAnc->item(0)->getAttribute('href');
echo '<a class="has-child" href="'.$href.'">'.$mainAnc->item(0)->nodeValue.'</a>';
$secUls = $mainLi->getElementsByTagName('ul');
if($secUls->length < 2){
foreach ($secUls as $secUl) {
echo '<ul>';
$secLis = $secUl->getElementsByTagName('li');
foreach ($secLis as $secLi) {
echo '<li>';
$secAnc = $mainLi->getElementsByTagName('a');
$shref = $secAnc->item(0)->getAttribute('href');
echo '<a href="'.$shref.'">'.$secAnc->item(0)->nodeValue.'</a>';
echo '</li>';
}
echo '</ul>';
}
}
echo '</li>';
}
echo '</ul>';
}
?>
But this doesn't work for me; it returns the following output:
<ul>
<li>
<a class="has-child" href="">First Menu</a>
<ul>
<li>
<a href="">First Menu</a>
</li>
<li>
<a href="">First Menu</a>
</li>
</ul>
</li>
<li>
<a class="has-child" href="http://example.com/fm1">F menu 1</a>
</li>
<li>
<a class="has-child" href="http://example.com/fm2">F menu 2</a>
</li>
<li>
<a class="has-child" href="">Second Menu</a>
<ul>
<li>
<a href="">Second Menu</a>
</li>
<li>
<a href="">Second Menu</a>
</li>
</ul>
</li>
<li>
<a class="has-child" href="http://example.com/sm1">S menu 1</a>
</li>
<li>
<a class="has-child" href="http://example.com/sm2">S menu 2</a>
</li>
</ul>
I checked a lot of links that looked similar to my problem, but none of them helped.
How can I get the correct output? Thanks in advance.
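There are a few minor errors (picking values up from the wrong node; for example, the inner loop reads the anchor from $mainLi rather than $secLi), but there are two main problems.
The first is that getElementsByTagName() selects all descendant elements with that tag name, not just the direct children, so each call returns more tags than you expect. The code below uses XPath instead, because DOMDocument has no convenient way of asking for "only the direct children called X". An XPath query takes the context node as its starting point, and an expression such as 'a' matches only the <a> tags that are direct children of that context node.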
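The difference between the two lookups can be seen on a small fragment. This is only a sketch: the markup and the id "menu" are made up for illustration and are not the menu from the question.

```php
<?php
// Hypothetical, trimmed-down markup: one top-level item with a two-item sub-menu.
$html = '<ul id="menu"><li><a>First</a>'
      . '<ul><li><a href="/a">A</a></li><li><a href="/b">B</a></li></ul>'
      . '</li></ul>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xp = new DOMXPath($doc);

// Locate the outer list (the id "menu" is an assumption of this sketch).
$menu = $xp->query('//ul[@id="menu"]')->item(0);

// getElementsByTagName() walks the whole subtree, so nested <li> are counted too.
$allLis = $menu->getElementsByTagName('li');

// A relative XPath step such as 'li' matches only direct children of the context node.
$directLis = $xp->query('li', $menu);

echo $allLis->length, PHP_EOL;    // 3
echo $directLis->length, PHP_EOL; // 1
```

The outer list has one direct <li>, but three <li> elements in total once the sub-menu is included, which is why the loops in the question print the sub-items twice.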
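The other (main) problem is that you are building the output with echo statements. That can work, but it is prone to typos, invalid structure and so on. This code uses DOM API calls to create the document.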
$doc = new DomDocument;
$doc->validateOnParse = true;
$doc->loadHtml($html);
$xp = new DOMXPath($doc);
$header = $doc->getElementById('headerLara');
$mainUls = $xp->query('div/ul', $header);
foreach ($mainUls as $mainUl) {
$mainULE = $doc->createElement("ul");
$mainLis = $xp->query('li', $mainUl);
foreach ($mainLis as $mainLi) {
$li = $doc->createElement("li");
$mainAnc = $xp->query('a', $mainLi)[0];
$a = $doc->createElement("a", htmlspecialchars($mainAnc->nodeValue));
$href = $mainAnc->getAttribute('href');
if ( !empty($href) ) {
$a->setAttribute("href", $href);
}
$li->appendChild($a);
$secUls = $xp->query('ul', $mainLi);
if($secUls->length < 2){
foreach ($secUls as $secUl) {
$a->setAttribute("class", "has-child");
$secULE = $doc->createElement("ul");
$secLis = $xp->query('li', $secUl);
foreach ($secLis as $secLi) {
$secLIE = $doc->createElement("li");
$secAnc = $xp->query('a', $secLi);
$shref = $secAnc[0]->getAttribute('href');
$secA = $doc->createElement("a", htmlspecialchars($secAnc[0]->nodeValue));
$secA->setAttribute("href", $shref);
$secLIE->appendChild($secA);
$secULE->appendChild($secLIE);
}
$li->appendChild($secULE);
}
}
$mainULE->appendChild($li);
}
echo PHP_EOL.PHP_EOL.">>>>".$doc->saveHTML($mainULE);
// Next line replaces existing HTML
//$mainUl->parentNode->replaceChild($mainULE,$mainUl);
}
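If the menu could ever nest deeper than two levels, the same idea can be written recursively. The following is a minimal sketch, not the answer's own method: buildCleanList() is a hypothetical helper name, the markup is a shortened copy of the menu, and the link text is copied with createTextNode() instead of htmlspecialchars(), which is a substitution made here for illustration.

```php
<?php
/**
 * Recursively copy a <ul> into a new <ul> that keeps only <li>, <a> and href.
 * buildCleanList() is a hypothetical helper, not part of the original answer.
 */
function buildCleanList(DOMElement $ul, DOMXPath $xp, DOMDocument $doc): DOMElement
{
    $newUl = $doc->createElement('ul');
    foreach ($xp->query('li', $ul) as $li) {          // direct <li> children only
        $newLi = $doc->createElement('li');
        $newA = null;
        $anchor = $xp->query('a', $li)->item(0);      // direct <a> child, if any
        if ($anchor instanceof DOMElement) {
            $newA = $doc->createElement('a');
            // A text node is escaped on output, so no manual escaping is needed.
            $newA->appendChild($doc->createTextNode($anchor->textContent));
            if ($anchor->hasAttribute('href')) {
                $newA->setAttribute('href', $anchor->getAttribute('href'));
            }
            $newLi->appendChild($newA);
        }
        foreach ($xp->query('ul', $li) as $subUl) {   // direct nested lists
            if ($newA !== null) {
                $newA->setAttribute('class', 'has-child');
            }
            $newLi->appendChild(buildCleanList($subUl, $xp, $doc));
        }
        $newUl->appendChild($newLi);
    }
    return $newUl;
}

// Shortened, illustrative copy of the menu from the question.
$html = '<nav id="headerLara"><div><ul id="head_nav_ul" class="menu">'
      . '<li class="menu-item"><a>First Menu</a><ul class="sub-menu">'
      . '<li class="menu-item"><a href="http://example.com/fm1">F menu 1</a></li>'
      . '</ul></li>'
      . '<li class="menu-item"><a href="http://example.com/m3">Third Menu</a></li>'
      . '</ul></div></nav>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // some libxml versions warn about the HTML5 <nav> tag
$doc->loadHTML($html);
libxml_clear_errors();
$xp = new DOMXPath($doc);

$source = $xp->query('//nav[@id="headerLara"]/div/ul')->item(0);
$result = $doc->saveHTML(buildCleanList($source, $xp, $doc));
echo $result, PHP_EOL;
```

The printed list carries no class attributes from the source; only anchors whose <li> contains a nested list receive class="has-child".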