Why do requests stop after a certain time in Python?
I have this code that sends a GET-type request and reads the text from the response; the site links are read from a text file. The problem is that after 300 or 500 requests the script stops without showing any error, it just stops working??
import requests
sites = open(r'site.txt', 'r', encoding="utf8").readlines()
l_site = []
for i in sites:
    l_site.append(i)

for x in len(l_site):
    result = requests.get(f'{site}', allow_redirects=True).text
    open('result.txt', 'a').write(f'{result}\n')
I think there is more to your code than what you posted here, because I can't see where the site variable is created.
You could do something along these lines to get a better idea of where it stops.
import requests

sites = open(r'site.txt', 'r', encoding="utf8").readlines()
l_site = [s for s in sites]

with open('result.txt', 'a') as fb:
    for site in l_site:
        try:
            print(f"Processing {site}")
            result = requests.get(f'{site}', allow_redirects=True).text
            fb.write(f'{result}\n')
        except Exception as e:
            raise e
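A minimal variation of that idea, assuming the goal is to see exactly how far the loop gets before it stalls. The running counter and flush=True are my additions, not part of the original snippet:

import requests

sites = open(r'site.txt', 'r', encoding="utf8").readlines()

with open('result.txt', 'a') as fb:
    # enumerate gives a running count, and flush=True makes the progress
    # line visible even if the script later hangs or is killed
    for i, site in enumerate(sites, start=1):
        print(f"[{i}/{len(sites)}] Processing {site.strip()}", flush=True)
        try:
            result = requests.get(site.strip(), allow_redirects=True).text
            fb.write(f'{result}\n')
        except Exception as e:
            print(f"Failed on {site.strip()}: {e}", flush=True)
            raise

If the last printed line never gets a successor and no exception appears, the request itself is probably hanging, which is where a timeout (see the answer below) helps.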
If I understood correctly, this is what you want:
- read the URLs from site.txt
- if the HTTP request succeeds, append the response payload to result.txt
- if the HTTP request fails because of a timeout, append a line with the URL to another file

Here is a working piece of code. Note that you can change the except clause if you want to catch more kinds of errors.
import requests

URLS_FILE = 'site.txt'
RESULT_FILE = 'result.txt'
ERRORS_FILE = 'result-error.txt'


def handle_url(url: str, result_file, error_file):
    try:
        # 10 seconds timeout, not download time, but time to get an HTTP response
        content = requests.get(url, allow_redirects=True, timeout=10)
        result_file.write(f'{content.text}\n')
    except requests.exceptions.ConnectTimeout as e:
        error_file.write(f'{url}: {e}\n')


with open(URLS_FILE, 'r', encoding="utf8") as f:
    with open(RESULT_FILE, 'a') as rf:
        with open(ERRORS_FILE, 'a') as ef:
            for url in f.readlines():
                handle_url(url, rf, ef)
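If you want to catch more kinds of failures than connection timeouts, a broader handler is one option. This is a sketch that catches requests' documented base exception class rather than only ConnectTimeout:

import requests

def handle_url(url: str, result_file, error_file):
    try:
        # the timeout covers connecting and waiting for the first response byte
        content = requests.get(url, allow_redirects=True, timeout=10)
        result_file.write(f'{content.text}\n')
    except requests.exceptions.RequestException as e:
        # RequestException is the base class for timeouts, connection errors,
        # too many redirects, invalid URLs, and similar failures
        error_file.write(f'{url}: {e}\n')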