BeautifulSoup not returning span results
I'm learning bs4 and trying to scrape the span tag data from this website into a list, but nothing is returned. What am I doing wrong?
import requests
import bs4

root_url = 'http://www.timeanddate.com'
index_url = root_url + '/astronomy/tonga/nukualofa'
response = requests.get(index_url)
soup = bs4.BeautifulSoup(response.text, 'html.parser')
spans = soup.find_all('span', attrs={'id': 'qfacts'})  # comes back empty
for span in spans:
    print(span.string)
All of the page's span data sits inside this tag:
<div class="five columns" id="qfacts">
<p><span class="four">Current Time:</span> <span id="smct">16 Mar 2015 at
12:53:50 p.m.</span></p><br>
<p><span class="four">Sunrise Today:</span> <span class="three">6:43
a.m.</span> <span class="comp sa8" title="Map direction East">↑</span> 93°
East<br>
<span class="four">Sunset Today:</span> <span class="three">6:56
p.m.</span> <span class="comp sa24" title="Map direction West">↑</span>
268° West</p><br>
<p><span class="four">Moonrise Today:</span> <span class="three">1:55
a.m.</span> <span class="comp sa10" title="Map direction East">↑</span>
108° East<br>
<span class="four">Moonset Today:</span> <span class="three">3:17
p.m.</span> <span class="comp sa22" title="Map direction West">↑</span>
253° West</p><br>
<p><span class="four">Daylight Hours:</span> <span title=
"The current day is 12 hours, 13 minutes long which is 1m 13s shorter than yesterday.">
12 hours, 13 minutes (-1m 13s)</span></p>
</div>
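A subtle mistake: you are searching for span tags that have the id "qfacts", when what you actually want is to search for span tags inside the div that has that id.
Replace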
spans = soup.find_all('span', attrs={'id':'qfacts'})
with
div = soup.find('div', attrs={'id': 'qfacts'}) # <-- div not span
spans = div.find_all('span') # <-- now find the spans inside
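To check the fix without hitting the network, here is a minimal sketch that parses an abridged copy of the markup from the question. The `HTML` constant, the `clean` helper, and the `soup.select` variant are illustrative additions, not part of the original code.

```python
import bs4

# Abridged copy of the markup from the question, inlined so no request is needed.
HTML = """
<div class="five columns" id="qfacts">
<p><span class="four">Current Time:</span> <span id="smct">16 Mar 2015 at
12:53:50 p.m.</span></p>
<p><span class="four">Sunrise Today:</span> <span class="three">6:43
a.m.</span></p>
</div>
"""

soup = bs4.BeautifulSoup(HTML, "html.parser")

# Original query: no <span> carries id="qfacts", so nothing matches.
wrong = soup.find_all("span", attrs={"id": "qfacts"})

# Corrected query: locate the <div> by id, then collect the spans inside it.
div = soup.find("div", attrs={"id": "qfacts"})
spans = div.find_all("span")

# The same lookup written as a CSS selector.
selected = soup.select("#qfacts span")


def clean(tag):
    """Return the tag's text with line breaks and runs of spaces collapsed."""
    return " ".join(tag.get_text().split())


texts = [clean(span) for span in spans]
print(len(wrong), texts)
```

On the live page `soup.find` returns `None` if the div is missing (for example, if the site layout has changed), so it is worth checking `div is not None` before calling `div.find_all`.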
If you were looking for divs with some class, you might want to iterate over those divs and find all the spans within them, but this is an id, so a single find call should suffice.
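For the class-based case mentioned above, a rough sketch might look like the following. The markup and the `facts` class name are hypothetical and chosen only for illustration; they do not come from the page in the question.

```python
import bs4

# Hypothetical markup: two <div> blocks that share a class.
HTML = """
<div class="facts"><span>Sunrise</span> <span>6:43 a.m.</span></div>
<div class="facts"><span>Sunset</span> <span>6:56 p.m.</span></div>
"""

soup = bs4.BeautifulSoup(HTML, "html.parser")

rows = []
# class_ (trailing underscore) is bs4's keyword for filtering on the class attribute.
for div in soup.find_all("div", class_="facts"):
    # Collect the text of every span inside this particular div.
    rows.append([span.get_text() for span in div.find_all("span")])

print(rows)
```

Each inner list holds the spans of one div, so the grouping by container is preserved.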