Getting the content of urls with headers and writing it to a file (Python 3.7)
I have multiple urls, differing only in their query string parameters, that are sent to me daily by mail, e.g.:
urls = [f'https://example.com?query=from-{x+1}d+TO+-{x}d%data' for x in range(10)]
I want to write the content of all of these urls to a single file. I tried urllib.request:
import urllib.request
key = "some value"
requests = urllib.request.Request([url for url in urls], headers={"key":key})
<urllib.request.Request object at 0x7f48e8381490>
but the first pitfall is that a 'Request' object is not iterable:
responses = urllib.request.urlopen([request for request in requests])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'Request' object is not iterable
Ideally, the result would go to a file like this:
data = open('file_name', 'a')
data.write([response.read() for response in responses])
I also tried the requests library:
import requests
test = requests.Session()
r = test.get([url for url in urls], headers={"key":key})
but this fails with:
raise InvalidSchema("No connection adapters were found for '%s'" % url)
requests.exceptions.InvalidSchema: No connection adapters were found for <list of urls>
Is there a way to get the content of these urls with headers and send it to a file?
I think you may want to do something like this:
import urllib.request

key = "some value"
with open("file_name", "ab") as data:  # "ab": response.read() returns bytes
    for url in urls:
        req = urllib.request.Request(url, headers={"key": key})
        with urllib.request.urlopen(req) as response:
            data.write(response.read())
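If you prefer the requests library that you also tried, here is a minimal sketch of the same loop (assuming the same placeholder header name "key" and output file "file_name" from the question). The key point is that Session.get() takes a single url per call, not a list, which is what caused the InvalidSchema error:

```python
import requests

# Same url list as in the question
urls = [f'https://example.com?query=from-{x+1}d+TO+-{x}d%data' for x in range(10)]

def fetch_all(urls, key, path):
    # One Session reuses the connection and sends the header on every request.
    with requests.Session() as session, open(path, "a") as data:
        session.headers.update({"key": key})
        for url in urls:  # one url per get() call, never a list
            response = session.get(url)
            data.write(response.text)  # .text is str, so text-mode "a" is fine

# fetch_all(urls, "some value", "file_name")
```

Note that response.text is already decoded to str, so the file can be opened in text mode here, unlike with urllib's response.read(), which returns bytes.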