Find the next div tag in Beautiful Soup
A question about Beautiful Soup in Python
I have HTML like this:
<div class="content">Somedata</div>
<div class="content">Somedata</div>
<div class="content">Qualification</div>
<div class="content">THE DATA I WANT</div>
<div class="content">Somedata</div>
<div class="content">Somedata</div>
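The same div tags then repeat again.
In this case there is no id or any other unique tag; everything consists of div tags only.
How can I get the "THE DATA I WANT" text that comes after Qualification?
Thanks in advance.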
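from bs4 import BeautifulSoup

txt = '''
<div class="content">Somedata</div>
<div class="content">Somedata</div>
<div class="content">Qualification</div>
<div class="content">THE DATA I WANT</div>
<div class="content">Somedata</div>
<div class="content">Somedata</div>'''
soup = BeautifulSoup(txt, 'html.parser')
# :-soup-contains is the current spelling of the deprecated :contains pseudo-class;
# older Soup Sieve releases only accept :contains.
# select_one returns the first sibling <div> after the one containing "Qualification".
print(soup.select_one('div:-soup-contains("Qualification") ~ div').text)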
Prints:
THE DATA I WANT
Or:
print(soup.find(string="Qualification").find_next().text)
Or:
print(soup.find(lambda t: t.find_previous() and t.find_previous().text == 'Qualification').text)
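The CSS selector in the first snippet uses the general sibling combinator `~`, which matches every later sibling `<div>`; it only yields the right element because `select_one` stops at the first match. A minimal sketch of the difference between `~` and the adjacent sibling combinator `+`, assuming a Soup Sieve release recent enough to support `:-soup-contains` (the shortened `html` sample is made up for illustration):

```python
from bs4 import BeautifulSoup

# Shortened, illustrative sample based on the HTML in the question.
html = '''
<div class="content">Somedata</div>
<div class="content">Qualification</div>
<div class="content">THE DATA I WANT</div>
<div class="content">Somedata</div>
'''
soup = BeautifulSoup(html, "html.parser")

# "~" selects every <div> sibling that comes after the matching <div>.
general = [d.get_text() for d in soup.select('div:-soup-contains("Qualification") ~ div')]

# "+" selects only the <div> that immediately follows the matching <div>.
adjacent = [d.get_text() for d in soup.select('div:-soup-contains("Qualification") + div')]

print(general)
print(adjacent)
```

With `select` instead of `select_one`, the `+` form is probably the safer choice, since it never returns the unrelated divs further down.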
Edit: to iterate over the <div>s, you can use a simple for loop:
for item in soup.find_all(lambda t: t.name == 'div' and t.text == 'Qualification'):
    print(item.find_next().text)
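One caveat with the loop above: `find_next()` called on a tag returns the next tag in document order, so if the "Qualification" div ever wraps its text in a child tag (for example `<b>`), it would return that child rather than the following div. A hedged sketch that uses `find_next_sibling("div")` instead; the helper name `texts_after_label` and the extended `sample` markup are hypothetical, not from the question:

```python
from bs4 import BeautifulSoup


def texts_after_label(html, label="Qualification"):
    """Return the text of the <div> following each <div> whose text equals label."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for div in soup.find_all("div"):
        if div.get_text(strip=True) != label:
            continue
        # Skip whitespace and nested tags; look only at the next sibling <div>.
        following = div.find_next_sibling("div")
        if following is not None:
            results.append(following.get_text(strip=True))
    return results


# Illustrative sample: the label appears twice, once wrapped in <b>.
sample = '''
<div class="content">Somedata</div>
<div class="content"><b>Qualification</b></div>
<div class="content">THE DATA I WANT</div>
<div class="content">Somedata</div>
<div class="content">Qualification</div>
<div class="content">MORE DATA I WANT</div>
'''
print(texts_after_label(sample))
```

Returning a list keeps the helper usable when the same group of divs repeats, as the question describes.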
You can try this:
from bs4 import BeautifulSoup
html_doc ='''<div class="content">Somedata</div>
<div class="content">Somedata</div>
<div class="content">Qualification</div>
<div class="content">THE DATA I WANT</div>
<div class="content">Somedata</div>
<div class="content">Somedata</div>'''
soup = BeautifulSoup(html_doc, 'lxml')
result = soup.find_all("div", class_="content")[3].text
print(result)
The output will be:
THE DATA I WANT
Or:
import re
soup = BeautifulSoup(html_doc, 'lxml')
print(soup.find(string=re.compile('^THE DATA I WANT$')))
Or:
print(soup.find(string="Qualification").find_next().text)
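The hard-coded index `[3]` and the regex on the wanted text both assume you already know where or what the value is. A small sketch that instead locates the label and returns whatever follows it; the function name `text_after_label` is hypothetical, and `html.parser` is used here only so the example needs no extra parser installed:

```python
from bs4 import BeautifulSoup

html_doc = '''<div class="content">Somedata</div>
<div class="content">Somedata</div>
<div class="content">Qualification</div>
<div class="content">THE DATA I WANT</div>
<div class="content">Somedata</div>
<div class="content">Somedata</div>'''


def text_after_label(html, label):
    """Return the text of the content div right after the first one equal to label."""
    soup = BeautifulSoup(html, "html.parser")
    texts = [d.get_text(strip=True) for d in soup.find_all("div", class_="content")]
    # Walk consecutive pairs so the position never has to be hard-coded.
    for current, following in zip(texts, texts[1:]):
        if current == label:
            return following
    return None  # label missing, or it was the last div


print(text_after_label(html_doc, "Qualification"))
print(text_after_label(html_doc, "Missing"))
```

Returning `None` for a missing label avoids the `AttributeError` that chaining `.find_next()` onto a failed `find()` would raise.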