How do I debug this stack trace? (google.auth.transport.grpc.AuthMetadataPlugin)

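I created a function.

About 5% of the time I get a connection error that seems to reference Python site-packages rather than my actual code. How can I go about debugging this?

I have added retries to every step that reads from Cloud Storage, but this failure seems to happen before my code even starts running. Or are the logs not making it into Stackdriver?

Here is the full stack trace. I don't see any lines from my own code referenced in it.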

Function execution started
AuthMetadataPluginCallback "<google.auth.transport.grpc.AuthMetadataPlugin object at 0x7ea453f9e780>" raised exception!
Traceback (most recent call last):
  File "/env/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 672, in urlopen
    chunked=chunked,
  File "/env/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 387, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/opt/python3.7/lib/python3.7/http/client.py", line 1229, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/opt/python3.7/lib/python3.7/http/client.py", line 1275, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/opt/python3.7/lib/python3.7/http/client.py", line 1224, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/opt/python3.7/lib/python3.7/http/client.py", line 1016, in _send_output
    self.send(msg)
  File "/opt/python3.7/lib/python3.7/http/client.py", line 977, in send
    self.sock.sendall(data)
ConnectionResetError: [Errno 104] Connection reset by peer
None
During handling of the above exception, another exception occurred:
None
Traceback (most recent call last):
  File "/env/local/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/env/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 720, in urlopen
    method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
  File "/env/local/lib/python3.7/site-packages/urllib3/util/retry.py", line 400, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/env/local/lib/python3.7/site-packages/urllib3/packages/six.py", line 734, in reraise
    raise value.with_traceback(tb)
  File "/env/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 672, in urlopen
    chunked=chunked,
  File "/env/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 387, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/opt/python3.7/lib/python3.7/http/client.py", line 1229, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/opt/python3.7/lib/python3.7/http/client.py", line 1275, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/opt/python3.7/lib/python3.7/http/client.py", line 1224, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/opt/python3.7/lib/python3.7/http/client.py", line 1016, in _send_output
    self.send(msg)
  File "/opt/python3.7/lib/python3.7/http/client.py", line 977, in send
    self.sock.sendall(data)
urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
None
During handling of the above exception, another exception occurred:
None
Traceback (most recent call last):
  File "/env/local/lib/python3.7/site-packages/google/auth/transport/requests.py", line 123, in __call__
    method, url, data=body, headers=headers, timeout=timeout, **kwargs
  File "/env/local/lib/python3.7/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/env/local/lib/python3.7/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/env/local/lib/python3.7/site-packages/requests/adapters.py", line 498, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
None
The above exception was the direct cause of the following exception:
None
Traceback (most recent call last):
  File "/env/local/lib/python3.7/site-packages/google/auth/compute_engine/credentials.py", line 96, in refresh
    self._retrieve_info(request)
  File "/env/local/lib/python3.7/site-packages/google/auth/compute_engine/credentials.py", line 77, in _retrieve_info
    request, service_account=self._service_account_email
  File "/env/local/lib/python3.7/site-packages/google/auth/compute_engine/_metadata.py", line 200, in get_service_account_info
    recursive=True,
  File "/env/local/lib/python3.7/site-packages/google/auth/compute_engine/_metadata.py", line 132, in get
    response = request(url=url, method="GET", headers=_METADATA_HEADERS)
  File "/env/local/lib/python3.7/site-packages/google/auth/transport/requests.py", line 128, in __call__
    six.raise_from(new_exc, caught_exc)
  File "<string>", line 3, in raise_from
google.auth.exceptions.TransportError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
None
The above exception was the direct cause of the following exception:
None
Traceback (most recent call last):
  File "/env/local/lib/python3.7/site-packages/grpc/_plugin_wrapping.py", line 79, in __call__
    callback_state, callback))
  File "/env/local/lib/python3.7/site-packages/google/auth/transport/grpc.py", line 77, in __call__
    callback(self._get_authorization_headers(context), None)
  File "/env/local/lib/python3.7/site-packages/google/auth/transport/grpc.py", line 64, in _get_authorization_headers
    self._request, context.method_name, context.service_url, headers
  File "/env/local/lib/python3.7/site-packages/google/auth/credentials.py", line 124, in before_request
    self.refresh(request)
  File "/env/local/lib/python3.7/site-packages/google/auth/compute_engine/credentials.py", line 102, in refresh
    six.raise_from(new_exc, caught_exc)
  File "<string>", line 3, in raise_from
google.auth.exceptions.RefreshError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))

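I thought the problem was blob.download_as_string() hitting a connection error.

However, after deploying a simplified version of the function, I could not reproduce the error.

This thread says to add ConnectionResetError and ProtocolError as exceptions that should also be retried.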

from urllib3.exceptions import ProtocolError
from google.api_core import retry

predicate = retry.if_exception_type(
    ConnectionResetError, ProtocolError)
reset_retry = retry.Retry(predicate)

data = reset_retry(blob.download_as_string)()
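
One thing to note about the traceback: the exception at the top of the chain is google.auth.exceptions.RefreshError, and ConnectionResetError appears only at the bottom, so a predicate that checks just the outermost type might not match it. Below is a minimal sketch of a predicate that walks the __cause__/__context__ chain instead. The helper name caused_by and the local RefreshError stand-in are hypothetical and exist only for this illustration. It is also an assumption that the error reaches the calling code at all, since the log line comes from a gRPC auth callback.

```python
def caused_by(*exc_types):
    """Build a predicate that is true when an exception, or anything in its
    __cause__/__context__ chain, is an instance of one of exc_types."""
    def predicate(exc):
        seen = set()  # guard against cyclic exception chains
        while exc is not None and id(exc) not in seen:
            if isinstance(exc, exc_types):
                return True
            seen.add(id(exc))
            # Prefer the explicit cause ("raise ... from ..."), then fall
            # back to the implicit context ("during handling of ...").
            exc = exc.__cause__ or exc.__context__
        return False
    return predicate


class RefreshError(Exception):
    """Stand-in for google.auth.exceptions.RefreshError (demo only)."""


def failing_refresh():
    # Mimic the wrapping seen in the traceback: a low-level reset
    # re-raised as a higher-level error.
    try:
        raise ConnectionResetError(104, "Connection reset by peer")
    except ConnectionResetError as caught:
        raise RefreshError("Connection aborted.") from caught


is_reset = caused_by(ConnectionResetError)

try:
    failing_refresh()
except RefreshError as err:
    wrapped_matches = is_reset(err)

plain_matches = is_reset(ValueError("unrelated"))
```

Since the result is a callable that takes an exception and returns a bool, the same shape as the value returned by retry.if_exception_type, it could presumably be passed as the predicate to retry.Retry in the snippet above.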

I wish I knew why this connection error happens so often.

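I found the cause of this intermittent error.

GCP best practices recommend instantiating client connections in main.py, outside of main(). That code runs only when an instance cold starts.

For example:

[main.py] - instantiate the clients only during a cold start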

import builtins
from google.cloud import storage
from google.cloud import pubsub_v1
from google.cloud import logging as cloudlogging

# Create global clients to avoid unneeded network activity!
builtins.pubsub_client = pubsub_v1.PublisherClient()
builtins.storage_client = storage.Client()
builtins.log_client = cloudlogging.Client()

[other_func.py] - use the clients

bucket = storage_client.create_bucket(bucket_name)
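
As an alternative to attaching clients to builtins, here is a minimal sketch of a cached getter that builds the client once per instance and reuses it on later invocations. FakeStorageClient is a hypothetical stand-in for storage.Client() so the example runs without GCP credentials, and get_storage_client is an illustrative name. This only shows the reuse pattern and is not claimed to prevent the connection resets.

```python
import functools


class FakeStorageClient:
    """Stand-in for storage.Client(); counts how often it is constructed."""
    instances = 0

    def __init__(self):
        type(self).instances += 1


@functools.lru_cache(maxsize=None)
def get_storage_client():
    # Built on first use, then reused by every later invocation
    # served by the same function instance.
    return FakeStorageClient()


def main(request):
    client = get_storage_client()
    # Return the object's identity so reuse can be observed.
    return id(client)


first = main(None)
second = main(None)
```

Other modules can import get_storage_client from main.py rather than relying on a name injected into builtins, which keeps the dependency explicit.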

Examples relevant to networking

Examples relevant to logging