How to stagger asynchronous API calls to prevent Max retries with grequests library?
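I have a list of ~250K URLs for an API that I need to retrieve.
I've written a class using grequests that works exactly how I want, except that I think it is working too fast: after running through the entire list of URLs I get this error: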
Problem: url: HTTPSConnectionPool(host='url', port=123): Max retries exceeded with url: url (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x38f466c18>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',))
Code so far:
import grequests

lst = ['url', 'url2', 'url3']

class Test:
    def __init__(self):
        self.urls = lst

    def exception(self, request, exception):
        # Called by grequests.map for each request that raises
        print("Problem: {}: {}".format(request.url, exception))

    def run_async(self):
        # Renamed from `async`, which is a reserved keyword from Python 3.7 on
        return grequests.map((grequests.get(u) for u in self.urls), exception_handler=self.exception, size=5)

    def collate_responses(self, results):
        return [x.text for x in results]

test = Test()
# here we collect the results returned by the async function
results = test.run_async()
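How can I slow the code down a bit to prevent the 'Max retries error'? Or, better still, how can I chunk the list I have and pass the URLs in chunks?
Using python3.6 on mac.
Edit:
The question is not a duplicate: I have to pass many URLs to the same endpoint.
I tried replacing grequests.map with a loop and adding a sleep: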
for u in self.urls:
    req = grequests.get(u)
    job = grequests.send(req)
    sleep(5)
similar issue resolved with sleep
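One possible way to approach the chunking asked about above, as a rough sketch rather than a tested fix for this API: split the URL list into fixed-size slices and pause between slices. The names `chunked`, `fetch_in_chunks`, `fetch_batch`, `chunk_size` and `pause` are hypothetical and not part of grequests, and the default values are placeholders that would need tuning.

```python
import time
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def chunked(items: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_in_chunks(
    urls: Sequence[str],
    fetch_batch: Callable[[Sequence[str]], List[R]],
    chunk_size: int = 100,   # placeholder value, tune for the API
    pause: float = 1.0,      # seconds to wait between chunks (placeholder)
    sleep: Callable[[float], None] = time.sleep,
) -> List[R]:
    """Call `fetch_batch` on one chunk at a time, pausing between chunks."""
    results: List[R] = []
    for index, batch in enumerate(chunked(urls, chunk_size)):
        if index > 0:
            sleep(pause)  # pause only between chunks, not before the first
        results.extend(fetch_batch(batch))
    return results
```

With grequests, `fetch_batch` could be something like `lambda batch: grequests.map((grequests.get(u) for u in batch), exception_handler=self.exception, size=5)`. Note that grequests.map puts None in the result list for requests that failed, so collate_responses would probably need to skip None entries before reading `.text`.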