HTTP 403 error when trying to download a file from a URL using Python
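I am trying to download a file from this URL: https://www.microsoft.com/en-us/download/confirmation.aspx?id=56519
I can download the file manually by visiting the URL in a browser; the file (in JSON format) is saved automatically to the Downloads folder on my local machine.
However, I need to do this from a Python script. I tried urllib.request and wget, but in both cases I keep getting this error: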
urllib.request.urlretrieve(url, path)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 247, in urlretrieve
with contextlib.closing(urlopen(url, data)) as fp:
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 531, in open
response = meth(req, response)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 641, in http_response
'http', request, response, code, msg, hdrs)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 569, in error
return self._call_chain(*args)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 503, in _call_chain
result = func(*args)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 649, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
Python 3
import json
import urllib.request

with urllib.request.urlopen("https://download.microsoft.com/download/7/1/D/71D86715-5596-4529-9B13-DA13A5DE5B63/ServiceTags_Public_20210329.json") as url:
    data = json.loads(url.read().decode())
    print(data)
Is there a workaround for this, ideally one that copes with the download URL changing over time?
You can try the following script, which looks up the current download URL and then downloads the JSON file:
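import re
import urllib.request

import requests

# Fetch the confirmation page, which contains the current download link.
rq = requests.get("https://www.microsoft.com/en-us/download/confirmation.aspx?id=56519")
# Raw string so that \. matches a literal dot; the non-greedy match stops at the first ".json".
t = re.search(r"https://download\.microsoft\.com/download/.*?\.json", rq.text)
a = t.group()
print(a)
# This path uses what looks like an Azure Pipelines variable; substitute a real local path elsewhere.
path = r"$(Build.sourcesdirectory)\agent.json"
urllib.request.urlretrieve(a, path)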
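If the direct download still returns 403, one possible cause (an assumption here, not something confirmed in this thread) is that the server rejects the default Python User-Agent. The sketch below separates the two steps: extracting the link from the page HTML, and fetching it with a browser-like header. The helper names `find_json_url`, `build_request`, and `download`, and the `Mozilla/5.0` header value, are hypothetical choices for illustration.

```python
import re
import urllib.request

# Assumption: a browser-like User-Agent may be accepted where the default one is not.
BROWSER_HEADERS = {"User-Agent": "Mozilla/5.0"}

# Same idea as the regex in the script above, with the dots escaped.
JSON_LINK = re.compile(r"https://download\.microsoft\.com/download/\S+?\.json")


def find_json_url(html):
    """Return the first download.microsoft.com .json link in html, or None."""
    match = JSON_LINK.search(html)
    return match.group() if match else None


def build_request(url):
    """Wrap url in a Request that carries the browser-like headers."""
    return urllib.request.Request(url, headers=BROWSER_HEADERS)


def download(url, path):
    """Fetch url with the custom headers and write the response body to path."""
    with urllib.request.urlopen(build_request(url), timeout=30) as resp:
        with open(path, "wb") as out:
            out.write(resp.read())
```

Returning None from `find_json_url` when nothing matches lets the caller report a clear error if the page layout changes, instead of failing with an AttributeError on `.group()`.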