Getting the content of urls with headers and writing it to a file (Python 3.7)
I have multiple urls, differing only in their query string parameters, that are sent to me daily by mail, e.g.:
urls = [f'https://example.com?query=from-{x+1}d+TO+-{x}d%data' for x in range(10)]
I want to write the content of all of these urls to a single file. I tried urllib.request:
import urllib.request
key = "some value"
requests = urllib.request.Request([url for url in urls], headers={"key":key})
<urllib.request.Request object at 0x7f48e8381490>
but the first pitfall is that a 'Request' object is not iterable:
responses = urllib.request.urlopen([request for request in requests])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'Request' object is not iterable
Ideally, the result would go to a file like this:
data = open('file_name', 'a')
data.write([response.read() for response in responses])
I also tried the requests library:
import requests
test = requests.Session()
r = test.get([url for url in urls], headers={"key":key})
but this fails with:
raise InvalidSchema("No connection adapters were found for '%s'" % url)
requests.exceptions.InvalidSchema: No connection adapters were found for <list of urls>
Is there a way to get the content of these urls with headers and send it to a file?
I think you may want to do something like this:
import urllib.request

key = "some value"
with open("file_name", "ab") as data:  # "ab": response.read() returns bytes
    for url in urls:
        req = urllib.request.Request(url, headers={"key": key})
        with urllib.request.urlopen(req) as response:
            data.write(response.read())
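If you prefer the requests library that you also tried, here is a minimal sketch of the same loop (assuming the same placeholder header name "key" and output file "file_name" from the question). The key point is that Session.get() takes a single url per call, not a list, which is what caused the InvalidSchema error:

```python
import requests

# Same url list as in the question
urls = [f'https://example.com?query=from-{x+1}d+TO+-{x}d%data' for x in range(10)]

def fetch_all(urls, key, path):
    # One Session reuses the connection and sends the header on every request.
    with requests.Session() as session, open(path, "a") as data:
        session.headers.update({"key": key})
        for url in urls:  # one url per get() call, never a list
            response = session.get(url)
            data.write(response.text)  # .text is str, so text-mode "a" is fine

# fetch_all(urls, "some value", "file_name")
```

Note that response.text is already decoded to str, so the file can be opened in text mode here, unlike with urllib's response.read(), which returns bytes.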