为什么我的 urllib.request return 出现 http 错误 403?
Why does my urllib.request return a http error 403?
我正在尝试制作一个程序,使用 python 从站点下载一系列产品图片。该网站以某种 url 格式 https://www.sitename.com/XYZabcde 存储其图像,其中 XYZ 是代表产品品牌的三个字母,abcde 是介于 00000 和 30000 之间的一系列数字。
这是我的代码:
import urllib.request
def down(i, inp):
full_path = 'images/image-{}.jpg'.format(i)
url = "https://www.sitename.com/{}{}.jpg".format(inp,i)
urllib.request.urlretrieve(url, full_path)
print("saved")
return None
inp = input("brand :" )
i = 20100
while i <= 20105:
x = str(i)
y = x.zfill(5)
z = "https://www.sitename.com/{}{}.jpg".format(inp,y)
print(z)
down(y, inp)
i += 1
使用我编写的代码,我可以从中成功下载我知道存在的一系列图片,例如品牌 RVL 从 20100 到 20105 将成功下载这六张图片。
然而,当我扩大 while 循环以包含链接时,我不知道会给我一张图像,我得到这个错误代码:
Traceback (most recent call last):
File "c:/Users/euan/Desktop/university/programming/Python/parser/test - Copy.py", line 20, in <module>
down(y, inp)
File "c:/Users/euan/Desktop/university/programming/Python/parser/test - Copy.py", line 6, in down
urllib.request.urlretrieve(url, full_path)
File "C:\Users\euan\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 247, in urlretrieve
with contextlib.closing(urlopen(url, data)) as fp:
File "C:\Users\euan\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "C:\Users\euan\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 531, in open
response = meth(req, response)
File "C:\Users\euan\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 640, in http_response
response = self.parent.error(
File "C:\Users\euan\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 569, in error
return self._call_chain(*args)
File "C:\Users\euan\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 502, in _call_chain
result = func(*args)
File "C:\Users\euan\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 649, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
我该怎么做才能检查并避免任何会产生此结果的 url?
因此您无法提前知道哪些 URL 您无权访问,但您可以在下载内容周围加上 try-except:
import urllib.request, urllib.error
...
def down(i, inp):
full_path = 'images/image-{}.jpg'.format(i)
url = "https://www.sitename.com/{}{}.jpg".format(inp,i)
try:
urllib.request.urlretrieve(url, full_path)
print("saved")
except urllib.error.HTTPError as e:
print("failed:", e)
return None
在那种情况下,它只会打印例如“失败:HTTP 错误 403:禁止”每当无法获取 URL 时,程序将继续。
我正在尝试制作一个程序,使用 python 从站点下载一系列产品图片。该网站以某种 url 格式 https://www.sitename.com/XYZabcde 存储其图像,其中 XYZ 是代表产品品牌的三个字母,abcde 是介于 00000 和 30000 之间的一系列数字。 这是我的代码:
import urllib.request
def down(i, inp):
full_path = 'images/image-{}.jpg'.format(i)
url = "https://www.sitename.com/{}{}.jpg".format(inp,i)
urllib.request.urlretrieve(url, full_path)
print("saved")
return None
inp = input("brand :" )
i = 20100
while i <= 20105:
x = str(i)
y = x.zfill(5)
z = "https://www.sitename.com/{}{}.jpg".format(inp,y)
print(z)
down(y, inp)
i += 1
使用我编写的代码,我可以从中成功下载我知道存在的一系列图片,例如品牌 RVL 从 20100 到 20105 将成功下载这六张图片。 然而,当我扩大 while 循环以包含链接时,我不知道会给我一张图像,我得到这个错误代码:
Traceback (most recent call last):
File "c:/Users/euan/Desktop/university/programming/Python/parser/test - Copy.py", line 20, in <module>
down(y, inp)
File "c:/Users/euan/Desktop/university/programming/Python/parser/test - Copy.py", line 6, in down
urllib.request.urlretrieve(url, full_path)
File "C:\Users\euan\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 247, in urlretrieve
with contextlib.closing(urlopen(url, data)) as fp:
File "C:\Users\euan\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "C:\Users\euan\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 531, in open
response = meth(req, response)
File "C:\Users\euan\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 640, in http_response
response = self.parent.error(
File "C:\Users\euan\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 569, in error
return self._call_chain(*args)
File "C:\Users\euan\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 502, in _call_chain
result = func(*args)
File "C:\Users\euan\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 649, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
我该怎么做才能检查并避免任何会产生此结果的 url?
因此您无法提前知道哪些 URL 您无权访问,但您可以在下载内容周围加上 try-except:
import urllib.request, urllib.error
...
def down(i, inp):
full_path = 'images/image-{}.jpg'.format(i)
url = "https://www.sitename.com/{}{}.jpg".format(inp,i)
try:
urllib.request.urlretrieve(url, full_path)
print("saved")
except urllib.error.HTTPError as e:
print("failed:", e)
return None
在那种情况下,它只会打印例如“失败:HTTP 错误 403:禁止”每当无法获取 URL 时,程序将继续。