Requests to Google Cloud ML timeout

I'm making requests (online prediction) from Google App Engine to Google Cloud ML (a model I did not create), and sometimes I get the exception "Deadline exceeded while waiting for HTTP response from URL". Full trace:

    Deadline exceeded while waiting for HTTP response from URL: https://ml.googleapis.com/v1/projects/project-id/models/my-model/versions/v3:predict?alt=json (/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py:1552)
    Traceback (most recent call last):
      File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 1535, in __call__
        rv = self.handle_exception(request, response, e)
      File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 1529, in __call__
        rv = self.router.dispatch(request, response)
      File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 1278, in default_dispatcher
        return route.handler_adapter(request, response)
      File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 1102, in __call__
        return handler.dispatch()
      File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 572, in dispatch
        return self.handle_exception(e, self.app.debug)
      File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 570, in dispatch
        return method(*args, **kwargs)
      File "/base/data/home/apps/s~project-id/1.402312581449917691/main.py", line 90, in post
        response = predict(batch_obj=batch_data_obj)
      File "/base/data/home/apps/s~project-id/1.402312581449917691/run_cloud_predict.py", line 88, in predict
        response = request.execute()
      File "/base/data/home/apps/s~project-id/1.402312581449917691/lib/oauth2client/util.py", line 135, in positional_wrapper
        return wrapped(*args, **kwargs)
      File "/base/data/home/apps/s~project-id/1.402312581449917691/lib/googleapiclient/http.py", line 835, in execute
        method=str(self.method), body=self.body, headers=self.headers)
      File "/base/data/home/apps/s~project-id/1.402312581449917691/lib/googleapiclient/http.py", line 162, in _retry_request
        resp, content = http.request(uri, method, *args, **kwargs)
      File "/base/data/home/apps/s~project-id/1.402312581449917691/lib/oauth2client/client.py", line 631, in new_request
        redirections, connection_type)
      File "/base/data/home/apps/s~project-id/1.402312581449917691/lib/httplib2/__init__.py", line 1659, in request
        (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
      File "/base/data/home/apps/s~project-id/1.402312581449917691/lib/httplib2/__init__.py", line 1399, in _request
        (response, content) = self._conn_request(conn, request_uri, method, body, headers)
      File "/base/data/home/apps/s~project-id/1.402312581449917691/lib/httplib2/__init__.py", line 1355, in _conn_request
        response = conn.getresponse()
      File "/base/data/home/runtimes/python27/python27_dist/lib/python2.7/gae_override/httplib.py", line 526, in getresponse
        raise HTTPException(str(e))
    HTTPException: Deadline exceeded while waiting for HTTP response from URL: https://ml.googleapis.com/v1/projects/project-id/models/my-model/versions/v3:predict?alt=json

Now, I know Google App Engine has a 60-second response limit, which is why I'm executing the requests from a task queue. I've also tried the following:

    import socket
    from google.appengine.api import urlfetch

    URLFETCH_DEADLINE = 3600
    urlfetch.set_default_fetch_deadline(URLFETCH_DEADLINE)
    socket.setdefaulttimeout(URLFETCH_DEADLINE)

I'm building the API client like this:

    import httplib2
    from googleapiclient import discovery
    from oauth2client import service_account

    # e.g. the broad Cloud Platform scope (adjust to your needs)
    scopes = ['https://www.googleapis.com/auth/cloud-platform']

    credentials = service_account.ServiceAccountCredentials.from_json_keyfile_name(
        'credentials-file', scopes)
    http = httplib2.Http(timeout=36000)
    http = credentials.authorize(http)

    ml = discovery.build('ml', 'v1', http=http)
    request = ml.projects().predict(name=predict_ver_name, body=request_data)

Interestingly, the timeout sometimes occurs at around 70 seconds (69.9, 70, 70.1, etc.) and sometimes at around 120 seconds (119.8, 120.1, etc.), which suggests it has more to do with some internal Cloud ML processing deadline. I'm executing a few dozen requests in parallel via the task queue; successful response times range from a few seconds to ~110 seconds. I'm just curious whether anyone has had a similar experience, or can advise me on how to solve this, i.e. what is causing the deadlines.

Thanks for posting your experience.

- There is some startup cost, and depending on the request rate, multiple servers may need to be spun up to meet demand.
- What is the size of the model you are predicting against? Larger models tend to have higher startup costs.

Thanks.

You can easily set a timeout on the API client using the code below.

    import socket
    timeout_in_sec = 60 * 3  # 3-minute timeout limit
    socket.setdefaulttimeout(timeout_in_sec)

Then you can create your ML service object as usual, and it will have the extended timeout limit.

    ml_service = discovery.build('ml', 'v1')
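
Since the deadline errors here are intermittent, a complementary mitigation is to retry the predict call with exponential backoff. A minimal sketch, not from any Google library: the `execute_with_retry` helper and its parameters are illustrative names, and in real code you would catch only the relevant timeout/HTTP exception types rather than bare `Exception`:

```python
import random
import time

def execute_with_retry(request_fn, max_attempts=4, base_delay=1.0):
    """Call request_fn() and retry failures with exponential backoff.

    request_fn is expected to wrap something like request.execute().
    """
    for attempt in range(max_attempts):
        try:
            return request_fn()
        except Exception:
            # Give up and re-raise after the final attempt.
            if attempt == max_attempts - 1:
                raise
            # Back off 1 s, 2 s, 4 s, ... plus a little jitter so that
            # parallel task-queue workers do not retry in lockstep.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))

# Usage with the request object built earlier:
# response = execute_with_retry(lambda: request.execute())
```

Note that `googleapiclient` also ships a built-in variant of this: `request.execute()` accepts a `num_retries` argument that retries transient failures for you.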