Adding a callback function on each retry attempt using requests/urllib3
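I have already implemented a retry mechanism for a requests session using urllib3.util.retry, as suggested elsewhere.

Now I am trying to figure out the best way to add a callback function that will be called on every retry attempt.

To explain myself further: it would be great if either the Retry object or the requests get method had a way to add a callback function. Maybe something like: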

    import requests
    from requests.packages.urllib3.util.retry import Retry
    from requests.adapters import HTTPAdapter

    def retry_callback(url):
        print(url)

    s = requests.Session()
    retries = Retry(total=5, status_forcelist=[500, 502, 503, 504])
    s.mount('http://', HTTPAdapter(max_retries=retries))

    url = 'http://httpstat.us/500'
    # Wished-for API only: requests does not accept these keyword arguments.
    s.get(url, callback=retry_callback, callback_params=[url])

I know that I could use logging to print the URL, but this is only a simple example of a more complex use.

You can subclass the Retry class to add that functionality.

This is the full interaction flow with the Retry instance for a given connection attempt:
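
1. Retry.increment() is called with the current method, the url, the response object (if there is one) and the exception (if one was raised), whenever an exception is raised, a 30x redirect response is returned, or the Retry.is_retry() method returns true.
   - .increment() re-raises the error if there was one and the object is configured not to retry that specific class of errors.
   - .increment() calls Retry.new() to create an updated instance, with any relevant counters updated and the history attribute amended with a new RequestHistory() instance (a named tuple).
   - .increment() raises a MaxRetryError exception if Retry.is_exhausted(), called on the return value of Retry.new(), is true. is_exhausted() returns true when any of the counters it tracks has dropped below 0 (counters set to None are ignored).
   - .increment() returns the new Retry instance.
2. The return value of Retry.increment() replaces the old Retry instance being tracked. If there was a redirect, Retry.sleep_for_retry() is called (sleeping if there was a Retry-After header); otherwise Retry.sleep() is called (which calls self.sleep_for_retry() to honour a Retry-After header, and otherwise just sleeps if there is a back-off policy). Then a recursive connection call is made with the new Retry instance.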
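
As a rough illustration of the counter bookkeeping in the flow above, here is a stripped-down toy model. It is not urllib3's code: ToyRetry, HistoryEntry, ExhaustedError and fetch are hypothetical names, only a single total counter is modelled, the recursion is written as a loop, and sleeping and redirects are left out entirely.

```python
from collections import namedtuple

# Hypothetical stand-in for urllib3's RequestHistory named tuple.
HistoryEntry = namedtuple("HistoryEntry", ["method", "url", "status"])


class ExhaustedError(Exception):
    """Hypothetical stand-in for MaxRetryError in this toy model."""


class ToyRetry:
    def __init__(self, total, history=()):
        self.total = total
        self.history = tuple(history)

    def new(self, **kw):
        # Copy the current settings, then apply the updated values.
        params = {"total": self.total, "history": self.history}
        params.update(kw)
        return type(self)(**params)

    def is_exhausted(self):
        # A counter set to None is ignored; otherwise it must drop below 0.
        return self.total is not None and self.total < 0

    def increment(self, method, url, status=None):
        total = None if self.total is None else self.total - 1
        history = self.history + (HistoryEntry(method, url, status),)
        new_retry = self.new(total=total, history=history)
        if new_retry.is_exhausted():
            raise ExhaustedError(url)
        return new_retry


def fetch(retry, method="GET", url="/500"):
    """Pretend every attempt returns a 500 and count the attempts made."""
    attempts = 0
    while True:
        attempts += 1
        try:
            # The returned instance replaces the one being tracked.
            retry = retry.increment(method, url, status=500)
        except ExhaustedError:
            return attempts


print(fetch(ToyRetry(total=5)))  # prints 6
```

With total=5 the toy loop makes six attempts, because the counter has to drop below 0, not merely reach 0, before the instance counts as exhausted.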
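
This gives you 3 good callback points: at the start of .increment(), when creating the new Retry instance, and in a context manager around super().increment(), to let a callback veto an exception or update the returned retry policy on exit.

This is what putting a hook at the start of .increment() would look like: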

    import logging

    from urllib3.util.retry import Retry

    logger = logging.getLogger(__name__)

    class CallbackRetry(Retry):
        def __init__(self, *args, **kwargs):
            # Remove our extra keyword before Retry sees it.
            self._callback = kwargs.pop('callback', None)
            super(CallbackRetry, self).__init__(*args, **kwargs)

        def new(self, **kw):
            # pass along the subclass additional information when creating
            # a new instance.
            kw['callback'] = self._callback
            return super(CallbackRetry, self).new(**kw)

        def increment(self, method, url, *args, **kwargs):
            if self._callback:
                try:
                    self._callback(url)
                except Exception:
                    # A failing callback must not break the retry logic.
                    logger.exception('Callback raised an exception, ignoring')
            return super(CallbackRetry, self).increment(method, url, *args, **kwargs)

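Note that the url argument is really only the URL path; the network-location portion of the request is omitted (you would have to extract that from the _pool argument, which has .scheme, .host and .port attributes).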
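
If the callback needs an absolute URL, one option is to rebuild it from those pool attributes. This is a minimal sketch, assuming a pool-like object that exposes .scheme, .host and .port as described above; build_full_url and DEFAULT_PORTS are hypothetical names, and a SimpleNamespace stands in for the real pool so the snippet runs without a network connection.

```python
from types import SimpleNamespace

# Ports that are normally left out of a URL for the given scheme.
DEFAULT_PORTS = {"http": 80, "https": 443}


def build_full_url(pool, path):
    """Rebuild an absolute URL from a pool-like object and a request path."""
    if pool is None:
        # No pool information available: fall back to the bare path.
        return path
    netloc = pool.host
    if pool.port is not None and pool.port != DEFAULT_PORTS.get(pool.scheme):
        netloc = "{}:{}".format(netloc, pool.port)
    return "{}://{}{}".format(pool.scheme, netloc, path)


# Stand-in for the connection pool that is passed as _pool (assumption).
fake_pool = SimpleNamespace(scheme="http", host="httpstat.us", port=80)
print(build_full_url(fake_pool, "/500"))  # prints http://httpstat.us/500
```

Inside increment() the pool arrives as the _pool keyword argument, so the call would look something like build_full_url(kwargs.get('_pool'), url).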

Demo:

    >>> def retry_callback(url):
    ...     print('Callback invoked with', url)
    ...
    >>> s = requests.Session()
    >>> retries = CallbackRetry(total=5, status_forcelist=[500, 502, 503, 504], callback=retry_callback)
    >>> s.mount('http://', HTTPAdapter(max_retries=retries))
    >>> s.get('http://httpstat.us/500')
    Callback invoked with /500
    Callback invoked with /500
    Callback invoked with /500
    Callback invoked with /500
    Callback invoked with /500
    Callback invoked with /500
    Traceback (most recent call last):
      File "/.../lib/python3.6/site-packages/requests/adapters.py", line 440, in send
        timeout=timeout
      File "/.../lib/python3.6/site-packages/urllib3/connectionpool.py", line 732, in urlopen
        body_pos=body_pos, **response_kw)
      File "/.../lib/python3.6/site-packages/urllib3/connectionpool.py", line 732, in urlopen
        body_pos=body_pos, **response_kw)
      File "/.../lib/python3.6/site-packages/urllib3/connectionpool.py", line 732, in urlopen
        body_pos=body_pos, **response_kw)
      [Previous line repeated 1 more times]
      File "/.../lib/python3.6/site-packages/urllib3/connectionpool.py", line 712, in urlopen
        retries = retries.increment(method, url, response=response, _pool=self)
      File "<stdin>", line 8, in increment
      File "/.../lib/python3.6/site-packages/urllib3/util/retry.py", line 388, in increment
        raise MaxRetryError(_pool, url, error or ResponseError(cause))
    urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='httpstat.us', port=80): Max retries exceeded with url: /500 (Caused by ResponseError('too many 500 error responses',))

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/.../lib/python3.6/site-packages/requests/sessions.py", line 521, in get
        return self.request('GET', url, **kwargs)
      File "/.../lib/python3.6/site-packages/requests/sessions.py", line 508, in request
        resp = self.send(prep, **send_kwargs)
      File "/.../lib/python3.6/site-packages/requests/sessions.py", line 618, in send
        r = adapter.send(request, **kwargs)
      File "/.../lib/python3.6/site-packages/requests/adapters.py", line 499, in send
        raise RetryError(e, request=request)
    requests.exceptions.RetryError: HTTPConnectionPool(host='httpstat.us', port=80): Max retries exceeded with url: /500 (Caused by ResponseError('too many 500 error responses',))

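Putting a hook in the .new() method would let you adjust the policy for the next attempt, as well as let you introspect the .history attribute, but it would not let you avoid the exception being re-raised.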
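
The other two callback points can be sketched along the same lines. This is only an illustration under assumptions: HookedRetry and the on_new / on_exhausted keyword arguments are hypothetical names, not part of urllib3. The on_new hook receives the history of the freshly created instance, and the on_exhausted hook merely observes the MaxRetryError before it is re-raised.

```python
import logging

from urllib3.exceptions import MaxRetryError
from urllib3.util.retry import Retry

logger = logging.getLogger(__name__)


class HookedRetry(Retry):
    def __init__(self, *args, **kwargs):
        # Hypothetical extra keywords; removed before Retry sees them.
        self._on_new = kwargs.pop("on_new", None)
        self._on_exhausted = kwargs.pop("on_exhausted", None)
        super().__init__(*args, **kwargs)

    def new(self, **kw):
        # Carry the hooks over to the instance created for the next attempt.
        kw["on_new"] = self._on_new
        kw["on_exhausted"] = self._on_exhausted
        new_retry = super().new(**kw)
        if self._on_new:
            try:
                # history is a tuple of RequestHistory named tuples.
                self._on_new(new_retry.history)
            except Exception:
                logger.exception("on_new hook raised an exception, ignoring")
        return new_retry

    def increment(self, method=None, url=None, *args, **kwargs):
        try:
            return super().increment(method, url, *args, **kwargs)
        except MaxRetryError as exc:
            if self._on_exhausted:
                try:
                    self._on_exhausted(url, exc)
                except Exception:
                    logger.exception("on_exhausted hook raised, ignoring")
            # This sketch only observes the failure; it does not veto it.
            raise
```

An instance would be mounted exactly like CallbackRetry, for example HTTPAdapter(max_retries=HookedRetry(total=5, on_new=some_function)). Vetoing the exception would mean returning a Retry instance from the except branch instead of re-raising, which needs care to avoid retrying forever.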