DOM loadHTML: extract nodes and child nodes

I have a list of items, and for each item I need to get the li title attribute, the link URL, the displayed link text, and the value of the span.

<ul>
<li class="testclass" title="Title 1 goes here">
<a href="http://examplelink1.com">List Text 1</a>
<span>Second List Text 1</span>
</li>
<li class="testclass" title="Title 2 goes here">
<a href="http://examplelink2.com">List Text 2</a>
<span>Second List Text 2</span>
</li>
</ul>

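How can I use foreach to extract each individual li element together with its values? Afterwards I need to insert the values into a MySQL database, each value in a different field.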

So far I can only get them separately:

<?php
$doc = new DOMDocument();
@$doc->loadHTML($list);
// Collect every <a> element and print its href attribute
$links = $doc->getElementsByTagName('a');
foreach ($links as $tag) {
    $link = $tag->getAttribute('href');
    echo $link.'<br/>';
}
?>

<?php
$doc = new DOMDocument();
@$doc->loadHTML($list);
// Collect every <li> element and print its title attribute
$items = $doc->getElementsByTagName('li');
foreach ($items as $tag) {
    $title = $tag->getAttribute('title');
    echo $title.'<br/>';
}
?>

I found a script that uses XPath, but I don't know how to apply it correctly to get the specific values I need and use them in a MySQL statement:

<?php
$dom = new DOMDocument();
@$dom->loadHTML($list);
$xpath = new DOMXPath($dom);
$elements = $xpath->query("//*");
foreach ($elements as $element) {
echo "<p>". $element->nodeName. "</p>";
$nodes = $element->childNodes;
foreach ($nodes as $node) {
echo $node->nodeValue. "<br/>";
}
}
?>

Use DOMXPath::evaluate(). It is part of ext/dom and lets you use XPath expressions to fetch nodes and scalar values from a DOM.

$dom = new DOMDocument();
// $html holds the list markup ($list in the question)
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// use an xpath expression to fetch the li nodes
foreach ($xpath->evaluate('//ul/li[@class="testclass"]') as $li) {
  var_dump(
    [
      // this is a direct attribute of the li node, use dom method
      'title' => $li->getAttribute('title'),
      // more complex, use an xpath expression
      'href' => $xpath->evaluate('string(a/@href)', $li),
      // Cast the node to a string to return the text content
      'link-text' => $xpath->evaluate('string(a)', $li),
      // works for the span, too
      'description' => $xpath->evaluate('string(span)', $li)
    ]
  );
}

Output:

array(4) {
  ["title"]=>
  string(17) "Title 1 goes here"
  ["href"]=>
  string(23) "http://examplelink1.com"
  ["link-text"]=>
  string(11) "List Text 1"
  ["description"]=>
  string(18) "Second List Text 1"
}
array(4) {
  ["title"]=>
  string(17) "Title 2 goes here"
  ["href"]=>
  string(23) "http://examplelink2.com"
  ["link-text"]=>
  string(11) "List Text 2"
  ["description"]=>
  string(18) "Second List Text 2"
}
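
Since the values are meant to end up in MySQL, the same loop can collect rows instead of dumping them, and a prepared statement can then insert each row. The following is only a minimal sketch under a few assumptions: the function names extractListItems() and insertListItems(), the table list_items, and its column names are hypothetical and not part of the question, and libxml_use_internal_errors() is swapped in for the @ operator to keep parser warnings quiet.

```php
<?php
/**
 * Extract one row per <li class="testclass"> from the list markup.
 * Each row has the keys title, href, link_text and description.
 */
function extractListItems(string $html): array
{
    $dom = new DOMDocument();
    // Collect parser warnings internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $rows = [];
    foreach ($xpath->evaluate('//ul/li[@class="testclass"]') as $li) {
        $rows[] = [
            'title'       => $li->getAttribute('title'),
            'href'        => $xpath->evaluate('string(a/@href)', $li),
            'link_text'   => $xpath->evaluate('string(a)', $li),
            'description' => $xpath->evaluate('string(span)', $li),
        ];
    }
    return $rows;
}

/**
 * Insert the extracted rows with one prepared statement.
 * Assumption: the table `list_items` and its columns are hypothetical.
 */
function insertListItems(PDO $pdo, array $rows): void
{
    $stmt = $pdo->prepare(
        'INSERT INTO list_items (title, href, link_text, description)
         VALUES (:title, :href, :link_text, :description)'
    );
    foreach ($rows as $row) {
        // Each value is bound to its own column placeholder.
        $stmt->execute([
            ':title'       => $row['title'],
            ':href'        => $row['href'],
            ':link_text'   => $row['link_text'],
            ':description' => $row['description'],
        ]);
    }
}

// The list markup from the question.
$list = <<<'HTML'
<ul>
<li class="testclass" title="Title 1 goes here">
<a href="http://examplelink1.com">List Text 1</a>
<span>Second List Text 1</span>
</li>
<li class="testclass" title="Title 2 goes here">
<a href="http://examplelink2.com">List Text 2</a>
<span>Second List Text 2</span>
</li>
</ul>
HTML;

$rows = extractListItems($list);
```

To store the rows, create a PDO connection with your own DSN and credentials (for example new PDO('mysql:host=localhost;dbname=test;charset=utf8mb4', $user, $password), where every value is a placeholder) and call insertListItems($pdo, $rows). Binding the values through placeholders keeps the scraped text out of the SQL string itself.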