使用 Python 从 URL 下载会出现 No Host Supplied 错误
Download with Python from URL gives No Host Supplied error
我正在尝试制作可下载漫画的应用程序,但每当我尝试下载图片时,它都会提示未提供主机。
我真的搜了,什么都没有
这是代码:
import requests,bs4
url='https://www.marvel.com/comics/issue/71314/edge_of_spider-geddon_2018_1'
res=requests.get(url,stream=True)
res.raise_for_status()
soup=bs4.BeautifulSoup(res.text)
elem=soup.select('div[class="row-item-image"] img')#.viewer-cnt .row .col-xs-12 #ppp img')
#print(elem)
comicurl='https:'+elem[0].get('src')
res=requests.get(comicurl,stream=True,allow_redirects=True)
res.raise_for_status()
with open(comicurl[comicurl.rfind('/')+1:],'wb') as i:
for chunk in res.iter_content(100000):
i.write(chunk)
我希望它能下载图像,但它给我这个错误:
Traceback (most recent call last):
File "C:\Users\Islam\AppData\Local\Programs\Python\Python36\comicdownloader.py", line 10, in <module>
res=requests.get(comicurl,stream=True,allow_redirects=True)
File "C:\Users\Islam\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\api.py", line 75, in get
return request('get', url, params=params, **kwargs)
File "C:\Users\Islam\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\api.py", line 60, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Users\Islam\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\sessions.py", line 519, in request
prep = self.prepare_request(req)
File "C:\Users\Islam\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\sessions.py", line 462, in prepare_request
hooks=merge_hooks(request.hooks, self.hooks),
File "C:\Users\Islam\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\models.py", line 313, in prepare
self.prepare_url(url, params)
File "C:\Users\Islam\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\models.py", line 390, in prepare_url
raise InvalidURL("Invalid URL %r: No host supplied" % url)
requests.exceptions.InvalidURL: Invalid URL 'https:https://i.annihil.us/u/prod/marvel/i/mg/6/b0/5b6c5e4154f75/portrait_uncanny.jpg': No host supplied
每当我在任何网站上尝试时,它都会给我。
看起来 elem[0].get('src')
的计算结果为 https://i.annihil.us/u/prod/marvel/i/mg/6/b0/5b6c5e4154f75/portrait_uncanny.jpg
。
所以在第 comicurl='https:'+elem[0].get('src')
行,你在已经正确形成的 url 前面添加 http:
,使其无效
无可争辩:Invalid URL 'https:https://i.annihil.us/u/prod
-- URL 确实无效,可能您应该在以下语句中删除 https
:
comicurl='https:'+elem[0].get('src')
我正在尝试制作可下载漫画的应用程序,但每当我尝试下载图片时,它都会提示未提供主机。
我真的搜了,什么都没有
这是代码:
import requests,bs4
url='https://www.marvel.com/comics/issue/71314/edge_of_spider-geddon_2018_1'
res=requests.get(url,stream=True)
res.raise_for_status()
soup=bs4.BeautifulSoup(res.text)
elem=soup.select('div[class="row-item-image"] img')#.viewer-cnt .row .col-xs-12 #ppp img')
#print(elem)
comicurl='https:'+elem[0].get('src')
res=requests.get(comicurl,stream=True,allow_redirects=True)
res.raise_for_status()
with open(comicurl[comicurl.rfind('/')+1:],'wb') as i:
for chunk in res.iter_content(100000):
i.write(chunk)
我希望它能下载图像,但它给我这个错误:
Traceback (most recent call last):
File "C:\Users\Islam\AppData\Local\Programs\Python\Python36\comicdownloader.py", line 10, in <module>
res=requests.get(comicurl,stream=True,allow_redirects=True)
File "C:\Users\Islam\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\api.py", line 75, in get
return request('get', url, params=params, **kwargs)
File "C:\Users\Islam\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\api.py", line 60, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Users\Islam\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\sessions.py", line 519, in request
prep = self.prepare_request(req)
File "C:\Users\Islam\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\sessions.py", line 462, in prepare_request
hooks=merge_hooks(request.hooks, self.hooks),
File "C:\Users\Islam\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\models.py", line 313, in prepare
self.prepare_url(url, params)
File "C:\Users\Islam\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\models.py", line 390, in prepare_url
raise InvalidURL("Invalid URL %r: No host supplied" % url)
requests.exceptions.InvalidURL: Invalid URL 'https:https://i.annihil.us/u/prod/marvel/i/mg/6/b0/5b6c5e4154f75/portrait_uncanny.jpg': No host supplied
每当我在任何网站上尝试时,它都会给我。
看起来 elem[0].get('src')
的计算结果为 https://i.annihil.us/u/prod/marvel/i/mg/6/b0/5b6c5e4154f75/portrait_uncanny.jpg
。
所以在第 comicurl='https:'+elem[0].get('src')
行,你在已经正确形成的 url 前面添加 http:
,使其无效
无可争辩:Invalid URL 'https:https://i.annihil.us/u/prod
-- URL 确实无效,可能您应该在以下语句中删除 https
:
comicurl='https:'+elem[0].get('src')