How do you select some elements but exclude others with the same partial class with XPath?

Question

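Using generic XPath (or features specific to lxml in Python), how do you select the elements whose class attribute contains a given class plus additional ones? For example, I want to match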

<div class="cl1 a">
<div class="cl1 b">

but not

<div class="cl1">

Answer 1

You can use the XPath //div[starts-with(@class,"cl1 ")]. Note the space after cl1: it is what excludes the element whose class is exactly "cl1". For example,

In [20]: import lxml.html as LH
In [21]: doc = LH.parse('data.html')
In [24]: doc.xpath('//div[starts-with(@class,"cl1 ")]')
Out[24]: [<Element div at 0x7f0568c68100>, <Element div at 0x7f0568c68158>]

In [25]: [LH.tostring(elt) for elt in doc.xpath('//div[starts-with(@class,"cl1 ")]')]
Out[25]: ['<div class="cl1 a"></div>\n', '<div class="cl1 b"></div>\n']

xpath

lxml