DOMDocument: get the text in an element that follows a found element

Question

I want to get the text of the ul > li that immediately follows the heading with the text ABC. In this case the text would be 123.

<h2>CDE</h2>
<ul>...</ul>

<h2>ABC</h2>
<ul>
  <li>
    <span>123</span>
  </li>
</ul>

This is what I have, but it doesn't work:

$dom = new DOMDocument();
$dom->loadHTML($html); // $html is the code above

$h2_all = $dom->getElementsByTagName('h2');

foreach($h2_all as $h2) {
  $h2_text = $h2->textContent;

  if (trim(strtolower($h2_text)) == 'abc') {
    var_dump($h2->nextSibling);
  }
}

I think it's because $h2 doesn't contain the ul data I need, but I'm not sure how to get it.

Answer 1

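The nextSibling of the h2 is a whitespace text node, not the ul. Walk through the siblings until you find the first ul: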

$ul = null;
foreach($dom->getElementsByTagName('h2') as $h2) {
    if(trim(strtolower($h2->textContent)) == "abc") {       
        $obj = $h2->nextSibling;
        while($obj != null) {
            if($obj->nodeName == "ul") {
                $ul = $obj;
                break 2;
            }
            $obj = $obj->nextSibling;
        }
    }
}
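// Make sure the ul has at least one li. firstChild would usually be a
// whitespace text node here, so look up the first li element instead.
$li = ($ul !== null) ? $ul->getElementsByTagName('li')->item(0) : null;
if ($li !== null) {
    echo trim($li->textContent);
}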
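
The sibling walk above can probably be shortened on PHP 8.0 or later, where DOMElement exposes nextElementSibling and whitespace text nodes are skipped for you. The following is only a sketch under that version assumption. The helper name firstLiTextAfterHeading is hypothetical (it is not part of the DOM extension), and unlike the loop above it only accepts a ul that is the very next element after the h2.

```php
<?php
// Sample markup copied from the question.
$html = <<<HTML
<h2>CDE</h2>
<ul>...</ul>

<h2>ABC</h2>
<ul>
  <li>
    <span>123</span>
  </li>
</ul>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);

// Hypothetical helper for illustration; requires PHP 8.0+ for nextElementSibling.
function firstLiTextAfterHeading(DOMDocument $dom, string $heading): ?string
{
    foreach ($dom->getElementsByTagName('h2') as $h2) {
        // Case-insensitive comparison on the trimmed heading text.
        if (strtolower(trim($h2->textContent)) !== strtolower($heading)) {
            continue;
        }
        // nextElementSibling ignores the whitespace text node after the h2.
        $next = $h2->nextElementSibling;
        if ($next !== null && $next->nodeName === 'ul') {
            $li = $next->getElementsByTagName('li')->item(0);
            return $li !== null ? trim($li->textContent) : null;
        }
    }
    return null; // no matching h2, or no ul/li directly after it
}

echo firstLiTextAfterHeading($dom, 'ABC'), PHP_EOL; // prints 123
```

Returning null rather than echoing inside the loop keeps the "not found" case explicit for the caller.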

Answer 2

You can use an XPath query:

$dom = new DOMDocument;
$dom->loadHTML($html);

$xp = new DOMXPath($dom);

$qry = '//ul[preceding::h2[1] = "ABC"]/li/span';

$result = $xp->query($qry)->item(0)->nodeValue;

Query details:

//         # the path can start from anywhere in the dom tree
ul
[preceding::h2[1] = "ABC"] # condition: the first preceding h2 has the value "ABC"
/li/span   # then continue the path down to the span node
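
As a variation on the query above, the path can start from the h2 and move forward with the following-sibling axis. This is a sketch rather than the answerer's method. It assumes the ul is the very next element after the h2, and it uses normalize-space() so that stray whitespace around the heading text does not break the match. The null check on the result is an addition for safety.

```php
<?php
// Sample markup copied from the question.
$html = <<<HTML
<h2>CDE</h2>
<ul>...</ul>

<h2>ABC</h2>
<ul>
  <li>
    <span>123</span>
  </li>
</ul>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xp = new DOMXPath($dom);

// h2 whose trimmed text is "ABC", then its first following element
// sibling (only if that sibling is a ul), then the li children.
$qry = '//h2[normalize-space() = "ABC"]/following-sibling::*[1][self::ul]/li';
$nodes = $xp->query($qry);

// query() returns false on an invalid expression and an empty list on no match.
$text = null;
if ($nodes !== false && $nodes->length > 0) {
    $text = trim($nodes->item(0)->textContent);
}

echo $text ?? 'not found', PHP_EOL; // prints 123
```

Note that the comparison here is case-sensitive, unlike the strtolower() check in the question. XPath 1.0 has no lower-case() function, so a case-insensitive match would need translate() or a check on the PHP side.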

php

domdocument