How to obtain class contents in HTML using BeautifulSoup?
This is the HTML I want to work with:
<section id='price'>
<div class="row">
<h4 class='col-sm-4'>Market Cap: <b><i class="fa fa-inr"></i> 10.64 Crores</b></h4>
<h4 class='col-sm-4'>Current Price: <b><i class="fa fa-inr"></i> 35.35</b></h4>
<h4 class='col-sm-4'>Book Value: <b><i class="fa fa-inr"></i> 53.52</b></h4>
</div>
My question is how to get the Market Cap, Current Price, and Book Value from the elements with class='col-sm-4'.
If I try:
print soup.row.col-sm-4.fa.fa-inr
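it doesn't work. I'm fairly new to Python and web scraping, so please bear with me and walk me through the whole process. Thanks in advance.
You can find each label by its text and then take its next_element: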
from bs4 import BeautifulSoup

data = """
<div class="row">
<h4 class='col-sm-4'>Market Cap: <b><i class="fa fa-inr"></i> 10.64 Crores</b></h4>
<h4 class='col-sm-4'>Current Price: <b><i class="fa fa-inr"></i> 35.35</b></h4>
<h4 class='col-sm-4'>Book Value: <b><i class="fa fa-inr"></i> 53.52</b></h4>
</div>
"""

# Name the parser explicitly so the result does not depend on what is installed
soup = BeautifulSoup(data, "html.parser")

titles = ['Market Cap', 'Current Price', 'Book Value']
for title in titles:
    # Find the text node that starts with the title; the element after it is the <b> tag
    label = soup.find(string=lambda s: s.startswith(title))
    print(label.next_element.text.strip())
Prints:
10.64 Crores
35.35
53.52
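To get the value as a float, split the text of the <b> tag on whitespace and take the first element:
for title in titles:
    price = soup.find(string=lambda s: s.startswith(title)).next_element.text.split()[0]
    print(float(price))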
You can also get them with a CSS selector:
# Assumes soup was built from the full page, which includes <section id='price'>
for item in soup.select('section#price div.row h4.col-sm-4 b'):
    print(item.text)
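As a self-contained sketch of the CSS selector approach, the snippet below parses the HTML from the question and collects each label with its numeric value. The closing </section> tag, the choice of the built-in html.parser, and the names HTML and values are assumptions made for illustration.

```python
from bs4 import BeautifulSoup

# HTML from the question, with a closing </section> added (assumption)
HTML = """
<section id='price'>
<div class="row">
<h4 class='col-sm-4'>Market Cap: <b><i class="fa fa-inr"></i> 10.64 Crores</b></h4>
<h4 class='col-sm-4'>Current Price: <b><i class="fa fa-inr"></i> 35.35</b></h4>
<h4 class='col-sm-4'>Book Value: <b><i class="fa fa-inr"></i> 53.52</b></h4>
</div>
</section>
"""

soup = BeautifulSoup(HTML, "html.parser")

values = {}
for h4 in soup.select("section#price div.row h4.col-sm-4"):
    # The label is the first text node of the <h4>, e.g. "Market Cap: "
    label = h4.contents[0].strip().rstrip(":")
    # The number is the first whitespace-separated token inside <b>
    number = h4.b.get_text().split()[0]
    values[label] = float(number)

print(values)
```

Keeping the label and the value together in a dict avoids hard-coding the list of titles, at the cost of relying on the label always being the first child of the <h4>.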
Try it this way:
>>> for x in soup.find_all("div", "row"):
...     print(x.text)
...
Market Cap: 10.64 Crores
Current Price: 35.35
Book Value: 53.52
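On why the attempt in the question fails: dotted access on a soup object looks up tag names rather than class names, and the hyphens in col-sm-4 are read by Python as minus signs. A minimal sketch of searching by class instead, assuming the same HTML and the built-in html.parser (the names html, headings and lines are illustrative):

```python
from bs4 import BeautifulSoup

html = """
<section id='price'>
<div class="row">
<h4 class='col-sm-4'>Market Cap: <b><i class="fa fa-inr"></i> 10.64 Crores</b></h4>
<h4 class='col-sm-4'>Current Price: <b><i class="fa fa-inr"></i> 35.35</b></h4>
<h4 class='col-sm-4'>Book Value: <b><i class="fa fa-inr"></i> 53.52</b></h4>
</div>
</section>
"""

soup = BeautifulSoup(html, "html.parser")

# "class" is a reserved word in Python, so the keyword argument is class_
headings = soup.find_all("h4", class_="col-sm-4")

# Collapse runs of whitespace so each heading becomes one clean line
lines = [" ".join(h4.get_text().split()) for h4 in headings]
for line in lines:
    print(line)
```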