How can I get a real-time progress bar when scraping with BeautifulSoup in Python?
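I have the following code, which scrapes some data from a website such as Redbubble. Sometimes I scrape a lot of data, and I would like to see real-time progress while the code runs. I tried the progressbar module, but it did not give me what I wanted.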
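import requests
from bs4 import BeautifulSoup

# Fetch the product page (named `response` to avoid shadowing the stdlib `re` module)
response = requests.get('https://www.redbubble.com/i/iphone-case/What-A-Time-To-Be-Alive-by-DinoMike/36490886.RIOBD')
src = response.content
# Parse the HTML and collect every <span> with the target class
soup = BeautifulSoup(src, "html.parser")
tags = soup.find_all("span", {"class" : "styles__children--21o3C"})
print(tags)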
If you have multiple pages to request, there is a neat library, tqdm, that displays a progress bar as you iterate over them.
import requests
from bs4 import BeautifulSoup
from tqdm import tqdm

# set of target URLs
urls = [
    "https://www.redbubble.com/i/iphone-case/What-A-Time-To-Be-Alive-by-DinoMike/36490886.RIOBD",
    ...
]
set_tags = []

# go through the list
for url in tqdm(urls):
    # get request
    soup = BeautifulSoup(requests.get(url).content, "html.parser")
    tags = soup.find_all("span", {"class": "styles__children--21o3C"})
    set_tags.append(tags)
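In a long run, a single failed request would otherwise stop the loop partway through the bar. The sketch below is one possible variation on the loop above, not part of the original answer: it wraps the iteration in a helper that labels the bar and records failures instead of raising. The names `scrape_with_progress`, `fetch`, and `parse` are hypothetical and introduced only for illustration. `desc` and `unit` are standard tqdm keyword arguments. The `ImportError` fallback is an assumption added so the sketch still runs (without a bar) when tqdm is not installed.

```python
try:
    from tqdm import tqdm
except ImportError:
    # Assumed fallback: without tqdm, iterate normally and show no bar.
    def tqdm(iterable, **kwargs):
        return iterable


def scrape_with_progress(urls, fetch, parse):
    """Fetch and parse each URL behind a progress bar.

    `fetch` takes a URL and returns page content; `parse` takes that
    content and returns whatever should be stored. Both are supplied by
    the caller (hypothetical hooks, not part of any library).
    """
    results = {}
    failures = {}
    for url in tqdm(urls, desc="Scraping", unit="page"):
        try:
            results[url] = parse(fetch(url))
        except Exception as exc:  # broad on purpose: one bad page should not end the run
            failures[url] = exc
    return results, failures


if __name__ == "__main__":
    # Stand-in data so the demo runs offline; no real site is contacted.
    fake_pages = {"page-1": "<span>a</span>", "page-2": "<span>b</span>"}
    found, failed = scrape_with_progress(fake_pages, fake_pages.get, str.upper)
    print(found, failed)
```

For real scraping, `fetch` could be something like `lambda url: requests.get(url, timeout=10).content`, and `parse` could build a `BeautifulSoup` object and call `find_all` as in the answer above. Passing the two steps in as callables keeps the progress loop independent of the network, which also makes it easy to try out with fake data first.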