XPath: get text content from multiple tags
I have this HTML template:
<ul>
<li>
<div>
<span class="field_full"><strong>Title 1</strong></span> :
<span itemprop="alternativeHeadline">
<span itemprop="alternativeHeadline">
DESC 1
</span>
</span></div>
</li>
<li>
<div>
<span class="field_full"><strong>Title 2</strong></span> :
<span itemscope="" itemtype="http://schema.org/type2" itemprop="type2">
<a href="/"><span itemprop="name">DESC 2</span></a>
</span>
</div>
</li>
<li>
<div>
<span class="field_full"><strong> Title 3</strong></span>:
<span itemprop="type3" itemscope="" itemtype="http://schema.org/type3">
<a href="/"><span itemprop="name">DESC 3-1</span></a>, <a href="/"><span itemprop="name">DESC 3-2</span></a>, <a href="/"><span itemprop="name">DESC 3-3</span></a>
</span>
</div>
</li>
<li>
<span class="field_full"><strong>Title 4</strong></span>:
<span> <a href="/">DESC 4</a></span>
</li>
<li>
<span class="field_full"><strong>Title 5</strong></span>:
<span itemprop="type">
<a href="/">DESC 5-1</a>, <a href="/">DESC 5-2</a>
</span>
</li>
<li>
<span class="field_full"><strong>Title 6</strong></span>:
<span itemprop="type">
DESC 6
</span>
</li>
<li>
<span class="field_full"><strong>Title 7</strong></span>:
<span itemprop="type">
DESC 7
</span>
</li>
<li>
<span class="field_full"><strong>Title 8</strong></span>:
<span itemprop="type">
<a href="/">DESC 8</a>
</span>
</li>
</ul>
I want to use XPath to get this expected result:
TITLE 1 = DESC 1
TITLE 2 = DESC 2
TITLE 3 = DESC 3-1, DESC 3-2, DESC 3-3
TITLE 4 = DESC 4
TITLE 5 = DESC 5-1, DESC 5-2
TITLE 6 = DESC 6
TITLE 7 = DESC 7
TITLE 8 = DESC 8
What have I tried?
$dom = new DOMDocument();
$dom->loadHTML($html_string);
$xpath = new DOMXPath($dom);
$elements = $xpath->query("//span[@class='field_full']");
foreach($elements as $e) {
echo $e->nodeValue . '<br/>';
}
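Unfortunately, this returns only the titles (TITLE 1, TITLE 2, TITLE 3, etc.).
I want to get the value that belongs to each title (in this example DESC 1, DESC 2, etc.).
What approach can I take to achieve this?
Thanks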
You should use the following expression:
//span[@class="field_full"]/following-sibling::span
To get the exact result you want, you can run a relative XPath query, using each original <span>
node as the context node:
$elements = $xpath->query("//span[@class='field_full']");
foreach ($elements as $e) {
    // Print the title, then the text of every <span> sibling that follows it.
    echo trim($e->nodeValue) . ' =';
    $spans = $xpath->query("following-sibling::span", $e);
    foreach ($spans as $span) echo " " . trim($span->nodeValue);
    echo "<br/>";
}
Output:
Title 1 = DESC 1<br/>
Title 2 = DESC 2<br/>
Title 3 = DESC 3-1, DESC 3-2, DESC 3-3<br/>
Title 4 = DESC 4<br/>
Title 5 = DESC 5-1, DESC 5-2<br/>
Title 6 = DESC 6<br/>
Title 7 = DESC 7<br/>
Title 8 = DESC 8<br/>
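If you would rather collect the pairs than echo them, the same relative-query idea can be sketched with `DOMXPath::evaluate()` and the XPath `normalize-space()` function, which trims the text and collapses inner whitespace in one step. The function name `extractFields` and the shortened sample markup are assumptions made for illustration, and the sketch assumes only the first following `<span>` sibling holds the description:

```php
<?php
// Hypothetical helper (not from the original post): maps each title to its description.
function extractFields(string $html): array
{
    // Collect libxml parse warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $fields = [];
    foreach ($xpath->query("//span[@class='field_full']") as $title) {
        // normalize-space() trims the string value and collapses whitespace runs.
        $key = $xpath->evaluate("normalize-space(.)", $title);
        // Relative query: the first <span> sibling after the title span.
        $value = $xpath->evaluate("normalize-space(following-sibling::span[1])", $title);
        $fields[$key] = $value;
    }
    return $fields;
}

// Shortened sample based on the markup in the question.
$html = <<<HTML
<ul>
<li><div>
<span class="field_full"><strong> Title 3</strong></span>:
<span itemprop="type3">
<a href="/"><span itemprop="name">DESC 3-1</span></a>, <a href="/"><span itemprop="name">DESC 3-2</span></a>
</span>
</div></li>
<li>
<span class="field_full"><strong>Title 6</strong></span>:
<span itemprop="type">
DESC 6
</span>
</li>
</ul>
HTML;

foreach (extractFields($html) as $title => $desc) {
    echo $title . ' = ' . $desc . PHP_EOL;
}
```

For this sample it should print `Title 3 = DESC 3-1, DESC 3-2` and `Title 6 = DESC 6`. Returning an array keeps the extraction separate from the output formatting, so the `<br/>` markup can be added wherever the values are rendered.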