Can following-sibling get many attributes?
I have a structure like this:
<li class="Title">This</li>
<li><a href="">AAA</a></li>
<li><a href="">BBB</a></li>
<li><a href="">CCC</a></li>
<li class="Title">That</li>
<li><a href="">DDD</a></li>
<li><a href="">EEE</a></li>
This is my XPath:
sites = sel.xpath("//li[@class='Title']")
for i, site in enumerate(sites):
    print(i)
    state = site.xpath("./text()")
    city = site.xpath("./following-sibling::li/a/text()")
The result is:
0
This
AAA
1
That
DDD
But I want to select all of the siblings, not just one.
How can I select all of the li siblings under each <li class="Title">, like this:
This
AAA
This
BBB
This
CCC
That
DDD
That
EEE
Try this:
import lxml.etree as etree
string = '''
<root>
<li class="Title">This</li>
<li><a href="">AAA</a></li>
<li><a href="">BBB</a></li>
<li><a href="">CCC</a></li>
<li class="Title">That</li>
<li><a href="">DDD</a></li>
<li><a href="">EEE</a></li>
</root>
'''
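st = ", "
tree = etree.fromstring(string)
# select the title li elements and every a inside an li, in document order
for i, node in enumerate(tree.xpath('//li[@class="Title"] | //li/a')):
    # index, text, and the name of the node's first attribute
    seq = (str(i), node.text, list(node.attrib.keys())[0])
    print(st.join(seq))
Output: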
0, This, class
1, AAA, href
2, BBB, href
3, CCC, href
4, That, class
5, DDD, href
6, EEE, href
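Note:
You now have enough to branch on the type of li you want, but be careful: there are no li child elements, despite what the indentation in the original post implies.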
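To make the branching idea concrete, here is a minimal sketch that is not part of the original answer. It walks the li elements in document order and remembers the most recent Title. It assumes Python 3 and uses the standard library's xml.etree.ElementTree so that it runs without extra packages; the calls used (fromstring, iter, get, find, text) also exist in lxml.etree. The names SAMPLE and pair_titles are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Same sample as above, wrapped in a root element (illustrative name).
SAMPLE = '''
<root>
<li class="Title">This</li>
<li><a href="">AAA</a></li>
<li><a href="">BBB</a></li>
<li><a href="">CCC</a></li>
<li class="Title">That</li>
<li><a href="">DDD</a></li>
<li><a href="">EEE</a></li>
</root>
'''


def pair_titles(root):
    """Return (title, link text) pairs, one per non-title li with a link."""
    pairs = []
    current_title = None
    for li in root.iter('li'):
        # Branch on the type of li: a Title starts a new group.
        if li.get('class') == 'Title':
            current_title = li.text
            continue
        link = li.find('a')
        # Skip li elements without a link or that appear before any Title.
        if link is not None and current_title is not None:
            pairs.append((current_title, link.text))
    return pairs


pairs = pair_titles(ET.fromstring(SAMPLE))
for title, city in pairs:
    print(title)
    print(city)
```

Because the loop keeps only one piece of state (the current title), it needs a single pass over the document and no sibling axis at all.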
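As an alternative (to check only the siblings that follow a <li class="Title"> element), you can iterate over the siblings and break out when you reach another <li class="Title"> element.
Like this:
import lxml.etree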
# I wrap your sample with an empty div
s = '''<div><li class="Title">This</li>
<li><a href="">AAA</a></li>
<li><a href="">BBB</a></li>
<li><a href="">CCC</a></li>
<li class="Title">That</li>
<li><a href="">DDD</a></li>
<li><a href="">EEE</a></li></div>'''
tree = lxml.etree.fromstring(s)
# find every <li class="Title"> element
for node in tree.xpath('.//li[@class="Title"]'):
    print('\n')
    # walk the <li> siblings that follow this <li class="Title">
    for sub in node.xpath('./following-sibling::li'):
        # stop at the next <li class="Title">;
        # other stop conditions can be implemented here as well
        if sub.get('class') == 'Title':
            break
        print(node.text)
        print(''.join(sub.xpath('./a/text()')))
Result:
This
AAA
This
BBB
This
CCC
That
DDD
That
EEE
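If you would rather let XPath do the filtering instead of breaking out of the loop by hand, one option is to count the preceding Title siblings. This is only a sketch under assumptions: Python 3, lxml installed, and all li elements sharing one parent as in the sample. SAMPLE and group_by_title are illustrative names, not taken from the answers above. The idea is that a following li belongs to a given Title when the number of Title elements before it equals that Title's position.

```python
from lxml import etree

# Same sample as above, wrapped in a div (illustrative name).
SAMPLE = '''<div><li class="Title">This</li>
<li><a href="">AAA</a></li>
<li><a href="">BBB</a></li>
<li><a href="">CCC</a></li>
<li class="Title">That</li>
<li><a href="">DDD</a></li>
<li><a href="">EEE</a></li></div>'''


def group_by_title(root):
    """Return a list of (title text, [link texts]) tuples."""
    groups = []
    for title in root.xpath('.//li[@class="Title"]'):
        # 1-based position of this Title among its Title siblings.
        position = int(
            title.xpath('count(preceding-sibling::li[@class="Title"])')
        ) + 1
        # Keep only following li elements preceded by exactly
        # `position` Title elements, i.e. those before the next Title.
        cities = title.xpath(
            'following-sibling::li'
            '[count(preceding-sibling::li[@class="Title"]) = $pos]'
            '/a/text()',
            pos=position,
        )
        groups.append((title.text, [str(city) for city in cities]))
    return groups


groups = group_by_title(etree.fromstring(SAMPLE))
for title, cities in groups:
    for city in cities:
        print(title)
        print(city)
```

The $pos variable is passed as a keyword argument to xpath(), which avoids building the expression with string formatting.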