Solving HTTP Error 400: Bad Request with working links in Google Chrome
I know this has been asked many times before, but I can't seem to find an answer, so I'm hoping for some help here.
I'm trying to download the files stored behind a list of urls.
I found the following code, which does what I want:
import os.path
import urllib.request

for link in links:
    link = link.strip()
    name = link.rsplit('/', 1)[-1]
    filename = os.path.join('downloads', name)
    if not os.path.isfile(filename):
        print('Downloading: ' + filename)
        try:
            urllib.request.urlretrieve(link, filename)
        except Exception as inst:
            print(inst)
            print('  Encountered unknown error. Continuing.')
But I always get: HTTP Error 400: Bad Request.
I tried setting a user agent to fake a browser visit (I use Google Chrome), but it didn't help at all. The links work if I copy them into the browser, so I'd like to know how to get around this.
You have to quote the spaces. I've used the quote function to quote the filename in your link. I've also used rindex to cut off the last part of the url path. There are urlsplit and urlunsplit functions that should be used instead of string manipulation, but.. I'm too lazy :D
import os.path
import urllib.request
from urllib.parse import quote

links = ['https://undpgefpims.org/attachments/6222/216410/1717887/1724973/6222_4NC_3BUR_Macedonia_Final ProDoc 30 July 2018.doc',
         'https://undpgefpims.org/attachments/6214/216405/1719672/1729436/6214_4NC_Niger_ProDoc final for DoA.doc']

for link in links:
    link = link.strip()
    name = link.rsplit('/', 1)[-1]
    filename = os.path.join('downloads', name)
    if not os.path.isfile(filename):
        print('Downloading: ' + filename)
        try:
            # Quote only the filename part (after the last '/'),
            # so spaces become %20.
            urllib.request.urlretrieve(link[:link.rindex('/') + 1] + quote(link[link.rindex('/') + 1:]), filename)
        except Exception as inst:
            print(inst)
            print('  Encountered unknown error. Continuing.')
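For reference, the urlsplit/urlunsplit approach mentioned above could look like this sketch (the helper name quote_url_path is mine, and the example url is made up):

```python
from urllib.parse import urlsplit, urlunsplit, quote

def quote_url_path(url):
    # Split the url into its components, quote only the path,
    # and reassemble it -- no manual slicing with rindex needed.
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, quote(parts.path),
                       parts.query, parts.fragment))

print(quote_url_path('https://example.org/some dir/file name.doc'))
# https://example.org/some%20dir/file%20name.doc
```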
I found the answer to my own question.
The problem was that the urls contain spaces, which urllib.request apparently cannot handle correctly. The solution is to quote the url first and then retrieve the quoted url.
Here is the working code for everyone running into the same problem:
import os.path
import urllib.request
import urllib.parse

for link in urls:
    link = link.strip()
    name = link.rsplit('/', 1)[-1]
    filename = os.path.join(name)
    # Quote the whole url, but leave ':' and '/' unescaped.
    quoted_url = urllib.parse.quote(link, safe=':/')
    if not os.path.isfile(filename):
        print('Downloading: ' + filename)
        try:
            urllib.request.urlretrieve(quoted_url, filename)
        except Exception as inst:
            print(inst)
            print('  Encountered unknown error. Continuing.')
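To see what quote with safe=':/' actually does, here is a quick check using the first url from the earlier answer: only the spaces are escaped, while the scheme separator and the path slashes are left alone.

```python
from urllib.parse import quote

url = ('https://undpgefpims.org/attachments/6222/216410/1717887/1724973/'
       '6222_4NC_3BUR_Macedonia_Final ProDoc 30 July 2018.doc')

# safe=':/' tells quote not to escape ':' and '/', so the url
# structure survives and only the spaces turn into %20.
quoted = quote(url, safe=':/')
print(quoted)
```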