如何获得下载 link 需要在附加对话框中勾选复选框
How to get a download link which requires checkboxes checking in additional dialog box
我想从 https://sam.gov/data-services/Exclusions/Public%20V2?privacy=Public
下载最后一个公开可用的文件
尝试手动下载时,实际下载 link 看起来像:
我的函数只得到初始link,它指的是真正的link:
import json
import requests
from operator import itemgetter
files_url = 'https://sam.gov/api/prod/fileextractservices/v1/api/listfiles?random=1653676394983&domain=Exclusions/Public%20V2&privacy=Public'
def get_file():
response = requests.get(files_url, stream=True)
links_resp = json.loads(response.text)
links_dicts = [d for d in links_resp['_embedded']['customS3ObjectSummaryList'] if d['displayKey'].count('SAM_Exclus')]
sorted_links = sorted(links_dicts, key=itemgetter('dateModified'), reverse=True)
return sorted_links[0]['_links']['self']['href']
get_file()
结果:
'https://s3.amazonaws.com/falextracts/Exclusions/Public V2/SAM_Exclusions_Public_Extract_V2_22150.ZIP'
但是按照上面的 link,我得到 拒绝访问
所以我会感谢任何关于如何获得真正下载的提示 links
我已经尽可能多地编辑了您的代码,以便您能够理解。请求库可以将其转换为 json 本身。
不在代码开头的导入看起来不太适合阅读...
import requests as req
from operator import itemgetter
files_url = "https://sam.gov/api/prod/fileextractservices/v1/api/listfiles?random=1653676394983&domain=Exclusions/Public%20V2&privacy=Public"
down_url = "https://sam.gov/api/prod/fileextractservices/v1/api/download/Exclusions/Public%20V2/{}?privacy=Public"
def get_file():
response = req.get(files_url, stream=True).json()
links_dicts = [d for d in response["_embedded"]["customS3ObjectSummaryList"]]
sorted_links = sorted(links_dicts, key=itemgetter('dateModified'), reverse=True)
key = sorted_links[0]['displayKey']
down = req.get(down_url.format(key))
if not down.status_code == 200:
return False
print(key)
open(key, 'wb').write(down.content)
return True
get_file()
我想从 https://sam.gov/data-services/Exclusions/Public%20V2?privacy=Public
下载最后一个公开可用的文件尝试手动下载时,实际下载 link 看起来像:
我的函数只得到初始link,它指的是真正的link:
import json
import requests
from operator import itemgetter
files_url = 'https://sam.gov/api/prod/fileextractservices/v1/api/listfiles?random=1653676394983&domain=Exclusions/Public%20V2&privacy=Public'
def get_file():
response = requests.get(files_url, stream=True)
links_resp = json.loads(response.text)
links_dicts = [d for d in links_resp['_embedded']['customS3ObjectSummaryList'] if d['displayKey'].count('SAM_Exclus')]
sorted_links = sorted(links_dicts, key=itemgetter('dateModified'), reverse=True)
return sorted_links[0]['_links']['self']['href']
get_file()
结果:
'https://s3.amazonaws.com/falextracts/Exclusions/Public V2/SAM_Exclusions_Public_Extract_V2_22150.ZIP'
但是按照上面的 link,我得到 拒绝访问
所以我会感谢任何关于如何获得真正下载的提示 links
我已经尽可能多地编辑了您的代码,以便您能够理解。请求库可以将其转换为 json 本身。
不在代码开头的导入看起来不太适合阅读...
import requests as req
from operator import itemgetter
files_url = "https://sam.gov/api/prod/fileextractservices/v1/api/listfiles?random=1653676394983&domain=Exclusions/Public%20V2&privacy=Public"
down_url = "https://sam.gov/api/prod/fileextractservices/v1/api/download/Exclusions/Public%20V2/{}?privacy=Public"
def get_file():
response = req.get(files_url, stream=True).json()
links_dicts = [d for d in response["_embedded"]["customS3ObjectSummaryList"]]
sorted_links = sorted(links_dicts, key=itemgetter('dateModified'), reverse=True)
key = sorted_links[0]['displayKey']
down = req.get(down_url.format(key))
if not down.status_code == 200:
return False
print(key)
open(key, 'wb').write(down.content)
return True
get_file()