Adding a callback function on each retry attempt using requests/urllib3

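I've implemented a retry mechanism for a requests session using urllib3.util.retry, as has been suggested elsewhere.

Now I'm trying to figure out the best way to add a callback function that will be called on every retry attempt.

To explain myself further: it would be great if either the Retry object or the requests get method had a way to add a callback function. Maybe something like this: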

import requests
from requests.packages.urllib3.util.retry import Retry
from requests.adapters import HTTPAdapter

def retry_callback(url):
    print(url)

s = requests.Session()
retries = Retry(total=5, status_forcelist=[ 500, 502, 503, 504 ])
s.mount('http://', HTTPAdapter(max_retries=retries))

url = 'http://httpstat.us/500'
# Wished-for API: requests does not accept callback / callback_params.
s.get(url, callback=retry_callback, callback_params=[url])

I know that for printing the url I could use logging, but this is only a simple example of a more complex use.

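You can subclass the Retry class to add that functionality.

This is the full interaction flow with the Retry instance for a given connection attempt:

  • Retry.increment() is called with the current method, url, response object (if there is one), and exception (if one was raised) whenever an exception is raised, a 30x redirect response is returned, or the Retry.is_retry() method returns true.
    • .increment() re-raises the error if there was one and the object is configured not to retry that specific class of errors.
    • .increment() calls Retry.new() to create an updated instance, with any relevant counters updated and the history attribute extended with a new RequestHistory() instance (a named tuple).
    • .increment() raises a MaxRetryError exception if Retry.is_exhausted(), called on the return value of Retry.new(), is true. is_exhausted() returns true when any of the counters it tracks has dropped below 0 (counters set to None are ignored).
    • .increment() returns the new Retry instance.
  • The return value of Retry.increment() replaces the old Retry instance being tracked. If there was a redirect, Retry.sleep_for_retry() is called (sleeping if there was a Retry-After header); otherwise Retry.sleep() is called (which calls self.sleep_for_retry() to honour a Retry-After header, and otherwise just sleeps if there is a back-off policy). Then a recursive connection call is made with the new Retry instance.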
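
The counter and exhaustion behaviour described in the list above can be observed without any network traffic by calling .increment() by hand. This is only a minimal sketch, assuming a reasonably recent urllib3 is installed; drive_retries is a hypothetical helper written for this illustration, not part of the library.

```python
from urllib3.exceptions import MaxRetryError
from urllib3.util.retry import Retry


def drive_retries(total):
    """Call increment() repeatedly until the retry policy is exhausted.

    Returns the value of .total seen on each new Retry instance.
    (Hypothetical helper, for illustration only.)
    """
    retry = Retry(total=total)
    seen = []
    try:
        while True:
            # No response or error is passed, so each call is counted
            # as a generic retry against the `total` counter.
            retry = retry.increment(method="GET", url="/500")
            seen.append(retry.total)
    except MaxRetryError:
        # Raised by increment() once a tracked counter drops below 0.
        pass
    return seen


print(drive_retries(2))
```

Each successful call returns a fresh Retry instance with the counter decremented, and the call that would push the counter below 0 raises MaxRetryError instead of returning.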

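This gives you 3 good callback points: at the start of .increment(), when the new Retry instance is created, and in a context manager around super().increment(), which lets a callback veto an exception or update the returned retry policy on exit.

This is what putting a hook at the start of .increment() would look like: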

import logging

from urllib3.util.retry import Retry

logger = logging.getLogger(__name__)

class CallbackRetry(Retry):
    def __init__(self, *args, **kwargs):
        self._callback = kwargs.pop('callback', None)
        super(CallbackRetry, self).__init__(*args, **kwargs)
    def new(self, **kw):
        # pass along the subclass additional information when creating
        # a new instance.
        kw['callback'] = self._callback
        return super(CallbackRetry, self).new(**kw)
    def increment(self, method, url, *args, **kwargs):
        if self._callback:
            try:
                self._callback(url)
            except Exception:
                logger.exception('Callback raised an exception, ignoring')
        return super(CallbackRetry, self).increment(method, url, *args, **kwargs)

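Note that the url argument is really only the URL path; the net location portion of the request is omitted (you'd have to extract that from the _pool argument, which has .scheme, .host and .port attributes).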
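
If the callback needs the absolute URL, it could be rebuilt from those pool attributes. The following is a rough sketch rather than anything urllib3 provides: full_url is a hypothetical helper, the stand-in pool object exists only for the demonstration, and IPv6 hosts are not handled.

```python
from types import SimpleNamespace


def full_url(pool, path):
    """Rebuild an absolute URL from a connection pool and a request path.

    `pool` is expected to expose .scheme, .host and .port, as the _pool
    argument passed to increment() does. Hypothetical helper.
    """
    # Nothing to rebuild without a pool, or if the value is already absolute.
    if pool is None or path.startswith(("http://", "https://")):
        return path
    default_ports = {"http": 80, "https": 443}
    netloc = pool.host
    # Only spell out the port when it is not the default for the scheme.
    if pool.port and pool.port != default_ports.get(pool.scheme):
        netloc = "{}:{}".format(pool.host, pool.port)
    return "{}://{}{}".format(pool.scheme, netloc, path)


# Stand-in for a real connection pool, for demonstration only.
fake_pool = SimpleNamespace(scheme="http", host="httpstat.us", port=80)
print(full_url(fake_pool, "/500"))
```

Inside an increment() override, the pool would be read with kwargs.get('_pool') before calling the callback.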

Demo:

>>> def retry_callback(url):
...     print('Callback invoked with', url)
...
>>> s = requests.Session()
>>> retries = CallbackRetry(total=5, status_forcelist=[500, 502, 503, 504], callback=retry_callback)
>>> s.mount('http://', HTTPAdapter(max_retries=retries))
>>> s.get('http://httpstat.us/500')
Callback invoked with /500
Callback invoked with /500
Callback invoked with /500
Callback invoked with /500
Callback invoked with /500
Callback invoked with /500
Traceback (most recent call last):
  File "/.../lib/python3.6/site-packages/requests/adapters.py", line 440, in send
    timeout=timeout
  File "/.../lib/python3.6/site-packages/urllib3/connectionpool.py", line 732, in urlopen
    body_pos=body_pos, **response_kw)
  File "/.../lib/python3.6/site-packages/urllib3/connectionpool.py", line 732, in urlopen
    body_pos=body_pos, **response_kw)
  File "/.../lib/python3.6/site-packages/urllib3/connectionpool.py", line 732, in urlopen
    body_pos=body_pos, **response_kw)
  [Previous line repeated 1 more times]
  File "/.../lib/python3.6/site-packages/urllib3/connectionpool.py", line 712, in urlopen
    retries = retries.increment(method, url, response=response, _pool=self)
  File "<stdin>", line 8, in increment
  File "/.../lib/python3.6/site-packages/urllib3/util/retry.py", line 388, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='httpstat.us', port=80): Max retries exceeded with url: /500 (Caused by ResponseError('too many 500 error responses',))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/.../lib/python3.6/site-packages/requests/sessions.py", line 521, in get
    return self.request('GET', url, **kwargs)
  File "/.../lib/python3.6/site-packages/requests/sessions.py", line 508, in request
    resp = self.send(prep, **send_kwargs)
  File "/.../lib/python3.6/site-packages/requests/sessions.py", line 618, in send
    r = adapter.send(request, **kwargs)
  File "/.../lib/python3.6/site-packages/requests/adapters.py", line 499, in send
    raise RetryError(e, request=request)
requests.exceptions.RetryError: HTTPConnectionPool(host='httpstat.us', port=80): Max retries exceeded with url: /500 (Caused by ResponseError('too many 500 error responses',))

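Putting a hook in the .new() method would let you adjust the policy for the next attempt, as well as let you introspect the .history attribute, but it would not let you avoid the exception being re-raised.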
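
A hook in .new() might look roughly like the sketch below. The class name HistoryRetry and the on_new keyword are assumptions made for this illustration; only Retry, .new(), .increment() and .history come from urllib3. The example drives .increment() by hand so that it runs without a network connection.

```python
from urllib3.util.retry import Retry


class HistoryRetry(Retry):
    """Hypothetical subclass that reports the history whenever new() runs."""

    def __init__(self, *args, **kwargs):
        # `on_new` is an invented keyword; remove it before Retry sees it.
        self._on_new = kwargs.pop("on_new", None)
        super().__init__(*args, **kwargs)

    def new(self, **kw):
        # Carry the hook over to the instance that replaces this one.
        kw["on_new"] = self._on_new
        updated = super().new(**kw)
        if self._on_new is not None:
            # The updated instance already holds the extended history.
            self._on_new(updated.history)
        return updated


history_lengths = []
retry = HistoryRetry(total=3, on_new=lambda h: history_lengths.append(len(h)))
retry = retry.increment(method="GET", url="/500")
retry = retry.increment(method="GET", url="/500")
print(history_lengths)
```

Because .increment() checks is_exhausted() only after .new() returns, the hook still sees the final attempt before MaxRetryError is raised, but it cannot suppress that exception.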