Requests or Urllib - Log in to a website, send a download request to a URL, and save as xlsx

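The following problem is driving me crazy. What I want to do is log in to a website, request a file download, and save the response as an xlsx file.

I am fairly sure I need to use the requests library, but I can't figure out exactly how. This is what I have so far: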

import requests

# URL data
login_url = 'https://reporting.integralplatform.com/uaa/login#/'
report_url = 'https://integralplatform.com/home/brand-safety/firewall?period=%5B2016-09-01..2017-02-19%5D&publisher=all&placement=all&deliveryEnvironment=%5Bdesktop%5D&includeCampaign=true&campaigns=52921%3A52919%3A52931%3A52922%3A52933%3A54272%3A52934%3A54370%3A54363%3A54372%3A54362%3A54369%3A54368%3A54366%3A54365%3A54367&includePlacement=true&grouping=placementName&dateGroup=daily'
download_url = 'https://integralplatform.com/reportingservice/api/teams/3236/fw/campaigns/52921%253A52919%253A52931%253A52922%253A52933%253A54272%253A52934/report.xls?period=%5B2016-09-01..2017-02-19%5D&cutoff=250&mediaType=mixed&groups=%5Bcamp%3Apub%3Aplac%3Adaily%5D&tabs=%5Bfirewall%5D&settings=%7B%22Selected%20Report%22%3A%22Firewall%20Activity%22%2C%22Group%20Dates%20By%22%3A%22Day%22%2C%22Report%20By%22%3A%22Campaign%22%2C%22Campaign%22%3A%22%25%25CAMPAIGN_NAMES%25%25%22%2C%22Media%20Partner%22%3A%22All%22%2C%22Placement%22%3A%22Yes%22%2C%22Geo%20Level%22%3A%22Country%22%2C%22Cutoff%22%3A%22250%22%7D'

# Payload
payload = {
    "username": 'my username',
    "password": 'my password',
    "_csrf_uaa": "507be70c-d4ff-4ea7-a3bf-d45cad3faa47",
}

# Authenticate
login = requests.post(login_url, data=payload)

# Download file
download = requests.post(download_url, data=payload)

However, when I look at both login.content and download.content, it seems I am not even getting past the login. Both contain:

b'<!DOCTYPE html>\n<html lang="en" ng-app="iasLogin">\n<head>\n\n    <meta charset="UTF-8">\n\n    <title>IAS Login</title>\n\n\n    <!-- Start Vendor CSS -->\n    <link rel="stylesheet" href="css/ias-app-vendor.min.css">\n    <!-- End Vendor CSS -->\n\n    <!-- Start IAS CSS -->\n    <link rel="stylesheet" href="css/ias-app.min.css">\n    <!-- End IAS CSS -->\n\n</head>\n<body>\n\n    <ias-headers></ias-headers>\n\n    <div ui-view></div>\n\n    <!-- Start Vendor JS -->\n    <script src="js/ias-app-vendor.min.js"></script>\n    <!-- End Vendor JS -->\n\n    <!-- Start IAS JS -->\n    <script src="js/ias-app.min.js"></script>\n    <!-- End IAS JS -->\n\n</body>\n</html>'

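It looks to me like I am clearly doing something wrong with the payload, but I don't know how to fix it.

To clarify, the difference between report_url and download_url is that download_url is the URL I get when I right-click the download button. Its parameters are fixed.

Thanks for your help, everyone.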

You probably need to include the headers and the form data that the page submits, along with your current payload.
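
As a rough illustration of that idea, here is a minimal sketch that keeps one session across all requests, so cookies set at login are sent with the download. Several things here are assumptions rather than facts about this site: that the CSRF token is exposed as a hidden `_csrf_uaa` input in the login HTML (the page in the question is a JavaScript app, so the token may instead arrive in a cookie or a later request), that the form posts back to `login_url` (the real form action may differ, so check the browser's network tab), and that the download is a plain GET. The helper names `find_csrf_token`, `looks_like_login_page`, and `login_and_download` are made up for this example.

```python
import re
from pathlib import Path


def find_csrf_token(html, field="_csrf_uaa"):
    """Return the value of a hidden CSRF input in the login HTML, or None."""
    # Assumption: the token appears as <input name="_csrf_uaa" value="...">.
    for tag in re.findall(r"<input\b[^>]*>", html, flags=re.I):
        if re.search(r'name=["\']%s["\']' % re.escape(field), tag):
            match = re.search(r'value=["\']([^"\']*)["\']', tag)
            if match:
                return match.group(1)
    return None


def looks_like_login_page(content):
    """Heuristic: the body shown in the question is the login page itself."""
    return b"<title>IAS Login</title>" in content


def login_and_download(session, login_url, download_url,
                       username, password, out_path):
    """Log in with one shared session, then fetch the report and save it."""
    # GET first so the session picks up cookies and any fresh CSRF token.
    page = session.get(login_url)
    payload = {"username": username, "password": password}
    token = find_csrf_token(page.text)
    if token is not None:
        payload["_csrf_uaa"] = token

    # Assumption: the form posts back to login_url with a Referer header.
    login = session.post(login_url, data=payload,
                         headers={"Referer": login_url})
    login.raise_for_status()

    # Assumption: the browser link is a plain download, so GET is used.
    report = session.get(download_url)
    report.raise_for_status()
    if looks_like_login_page(report.content):
        raise RuntimeError("Still getting the login page; login failed")

    # Write the raw bytes unchanged; no format conversion happens here.
    Path(out_path).write_bytes(report.content)
    return out_path
```

With the requests library this would be called as `login_and_download(requests.Session(), login_url, download_url, 'my username', 'my password', 'report.xls')`. The URL names the file report.xls, so the bytes are saved as received; turning them into a true xlsx file would be a separate conversion step.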