How can I extract hygiene ratings from Zomato?
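I'm working on a project to analyze the hygiene ratings of restaurants listed on Zomato in Delhi. I can fetch restaurant details with the Zomato /search API, but the API does not return a restaurant's hygiene rating.
I tried scraping the page instead, but I keep getting an error.
Scraping code: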
# import the libraries we use to fetch the page and parse the HTML
import requests
from bs4 import BeautifulSoup
headers = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36'}
# request the restaurant page, sending a browser-like User-Agent
response = requests.get("https://www.zomato.com/ncr/pearl-boutique-bakery-cafe-greater-kailash-gk-2-new-delhi", headers=headers)
content = response.content
# parse the HTML from our URL into the BeautifulSoup parse tree format
soup = BeautifulSoup(content, "lxml")
print(soup.prettify())
I keep getting the following error:
ConnectionError: ('Connection aborted.', OSError("(104, 'ECONNRESET')",))
Is there another way to extract a restaurant's hygiene rating from Zomato?
Specify the User-Agent HTTP header when requesting the page.
For example:
import requests
from bs4 import BeautifulSoup
url = 'https://www.zomato.com/ncr/pearl-boutique-bakery-cafe-greater-kailash-gk-2-new-delhi'
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:76.0) Gecko/20100101 Firefox/76.0'}
soup = BeautifulSoup(requests.get(url, headers=headers).content, 'html.parser')
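# ':contains' is deprecated in newer soupsieve releases; ':-soup-contains' is the current spelling
print(soup.select_one('p:-soup-contains("HYGIENE RATING") + p').get_text(strip=True))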
Prints:
5 - Excellent
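If the page layout changes or the connection drops, the one-liner above fails with an AttributeError or ConnectionError. A slightly more defensive sketch might look like the following. It assumes the rating is still rendered as a p element containing "HYGIENE RATING" followed by a sibling p with the value, as in the answer above; the function names fetch_html and extract_hygiene_rating, the retry settings, and the timeout are illustrative choices, not anything Zomato documents.

```python
import requests
from bs4 import BeautifulSoup
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Browser-like User-Agent, taken from the answer above.
HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:76.0) "
                  "Gecko/20100101 Firefox/76.0"
}


def fetch_html(url, timeout=10):
    """Fetch a page, retrying a few times on transient failures (hypothetical helper)."""
    session = requests.Session()
    # Assumed retry policy: up to 3 retries with exponential backoff.
    retry = Retry(total=3, backoff_factor=1,
                  status_forcelist=[429, 500, 502, 503, 504])
    session.mount("https://", HTTPAdapter(max_retries=retry))
    response = session.get(url, headers=HEADERS, timeout=timeout)
    response.raise_for_status()
    return response.text


def extract_hygiene_rating(html):
    """Return the hygiene rating text, or None if the page has no such label."""
    soup = BeautifulSoup(html, "html.parser")
    # Find the <p> whose text contains the label (assumed page structure).
    label = soup.find("p", string=lambda s: s is not None and "HYGIENE RATING" in s)
    if label is None:
        return None
    # The rating value is assumed to be the next sibling <p>.
    value = label.find_next_sibling("p")
    return value.get_text(strip=True) if value is not None else None
```

Calling extract_hygiene_rating(fetch_html(url)) with the restaurant URL from the question would then return the rating text, or None when the label is missing, so restaurants without a published rating can be skipped rather than crashing the loop.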