Beautiful Soup not returning anything

Question

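Hi, I'm trying to use Beautiful Soup to scrape a web page and print the fact shown on it. The site is https://fungenerators.com/random/facts/animal/weasel. I'm trying to scrape the fact, but my code always ends up printing []. Any idea what's wrong with it?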

from urllib.request import urlopen
from bs4 import BeautifulSoup

scrape = "https://fungenerators.com/random/facts/animal/weasel"

request_page = urlopen(scrape)
page_html = request_page.read()
request_page.close()

html_soup = BeautifulSoup(page_html, 'html.parser')

fact = html_soup.find_all('div', class_="wow fadeInUp animated animated")

print(fact)

Answer 1

Try this code instead:

import requests
from bs4 import BeautifulSoup

response = requests.get('https://fungenerators.com/random/facts/animal/weasel')

soup = BeautifulSoup(response.content, 'html.parser')

result = soup.select('div.wow.fadeInUp.animated.animated')

print(result[0].text)

The result will be:

Random Weasel  Fact

Alternatively, if you don't want to use a CSS selector, you can do this:

import requests
from bs4 import BeautifulSoup

response = requests.get('https://fungenerators.com/random/facts/animal/weasel')

soup = BeautifulSoup(response.content, 'html.parser')

result = soup.find_all('h2', class_="wow fadeInUp animated")

print(result[0].text)
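
To illustrate how the two approaches differ, here is a minimal sketch that runs on a hypothetical inline HTML snippet (SAMPLE_HTML is made up for the example, not copied from the site). A multi-word class_ string is compared against the whole class attribute value, while a CSS selector only requires each listed class to be present:

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page.
SAMPLE_HTML = """
<h2 class="wow fadeInUp animated">Random Weasel Fact</h2>
<h2 class="animated wow fadeInUp extra">Reordered classes</h2>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# A multi-word class_ string must equal the whole class attribute value,
# so only the first h2 matches here.
exact = soup.find_all("h2", class_="wow fadeInUp animated")

# A CSS selector only needs every listed class to be present,
# in any order, so both h2 tags match.
by_selector = soup.select("h2.wow.fadeInUp.animated")

# A single class name matches any tag carrying that class.
by_single_class = soup.find_all("h2", class_="animated")

print(len(exact), len(by_selector), len(by_single_class))
```

If the class order on the page is not guaranteed, the selector form or a single distinctive class name is probably the more robust choice.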

Answer 2

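There are two problems with your code:

The element you want is an h2 tag, not a div.
Because some of the data is loaded dynamically, the class name seen in the browser differs from the one in the downloaded HTML: the second occurrence of the word "animated" is missing. The class name is not wow fadeInUp animated animated, but wow fadeInUp animated.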
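
One way to check this yourself is to list the class attributes that the parser actually sees in the downloaded HTML, rather than trusting the browser's inspector. The sketch below is only an illustration: class_strings is a hypothetical helper, and SAMPLE_HTML is a made-up stand-in for the page source (in practice you would pass in page_html):

```python
from collections import Counter

from bs4 import BeautifulSoup


def class_strings(html, tag_name):
    """Count the class values found on tags named tag_name.

    Class names are joined with single spaces, which is the form a
    multi-word class_ argument has to match.
    """
    soup = BeautifulSoup(html, "html.parser")
    counts = Counter()
    for tag in soup.find_all(tag_name):
        classes = tag.get("class")  # list of class names, or None if absent
        if classes:
            counts[" ".join(classes)] += 1
    return counts


# Hypothetical stand-in for the HTML the server sends before any script runs.
SAMPLE_HTML = """
<div class="container">
  <h2 class="wow fadeInUp animated">Random Weasel Fact</h2>
</div>
"""

print(class_strings(SAMPLE_HTML, "h2"))
print(class_strings(SAMPLE_HTML, "div"))
```

If the string you are searching for does not show up in this listing, find_all() will return an empty list no matter what the browser shows.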

See the following example:

from urllib.request import urlopen
from bs4 import BeautifulSoup

scrape = "https://fungenerators.com/random/facts/animal/weasel"

request_page = urlopen(scrape)
page_html = request_page.read()
request_page.close()

html_soup = BeautifulSoup(page_html, 'html.parser')

fact = html_soup.find_all('h2', class_="wow fadeInUp animated")

print(fact)

(Since there is only one such tag, you may want to use find() instead of find_all(), so that you can read the text directly through the .text attribute):

...
fact = html_soup.find('h2', class_="wow fadeInUp animated").text
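
Keep in mind that find() returns None when nothing matches, so calling .text on the result raises an AttributeError if the page layout changes. A minimal sketch of a guarded version, tested here on hypothetical inline HTML (extract_fact is an illustrative helper name, not part of Beautiful Soup):

```python
from bs4 import BeautifulSoup


def extract_fact(html):
    """Return the text of the first matching h2, or None if there is none."""
    soup = BeautifulSoup(html, "html.parser")
    heading = soup.find("h2", class_="wow fadeInUp animated")
    if heading is None:
        # No matching tag: report that instead of raising AttributeError.
        return None
    # strip=True trims leading and trailing whitespace from the text.
    return heading.get_text(strip=True)


print(extract_fact('<h2 class="wow fadeInUp animated"> Random Weasel Fact </h2>'))
print(extract_fact("<p>no heading here</p>"))
```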

