Google Cloud Tasks 'Create Task' request is throwing ServiceUnavailable: 503
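I am migrating my tasks from App Engine TaskQueues to Google Cloud Tasks.
The problem is an hourly cron job that checks an S3 bucket for new files. The cron job starts a new task for each file it finds. Each of those tasks then downloads its file and starts a new task for every record in the file.
During this fan-out, some calls to create_task() (https://googleapis.dev/python/cloudtasks/latest/gapic/v2/api.html#google.cloud.tasks_v2.CloudTasksClient.create_task) appear to fail with ServiceUnavailable: 503.
Here is one: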
Traceback (most recent call last):
...
File "/base/data/home/apps/s~my_project/dev.XXXXXXXXXXXXXXXXXXX/src/utils/gc_tasks.py", line 72, in _gc_create_task
_ = _tasks_client.create_task(parent=_queue_path(DEFAULT_QUEUE), task=task)
File "/base/data/home/apps/s~my_project/dev.XXXXXXXXXXXXXXXXXXX/lib/google/cloud/tasks_v2/gapic/cloud_tasks_client.py", line 1512, in create_task
request, retry=retry, timeout=timeout, metadata=metadata
File "/base/data/home/apps/s~my_project/dev.XXXXXXXXXXXXXXXXXXX/lib/google/api_core/gapic_v1/method.py", line 143, in __call__
return wrapped_func(*args, **kwargs)
File "/base/data/home/apps/s~my_project/dev.XXXXXXXXXXXXXXXXXXX/lib/google/api_core/retry.py", line 273, in retry_wrapped_func
on_error=on_error,
File "/base/data/home/apps/s~my_project/dev.XXXXXXXXXXXXXXXXXXX/lib/google/api_core/retry.py", line 182, in retry_target
return target()
File "/base/data/home/apps/s~my_project/dev.XXXXXXXXXXXXXXXXXXX/lib/google/api_core/timeout.py", line 214, in func_with_timeout
return func(*args, **kwargs)
File "/base/data/home/apps/s~my_project/dev.XXXXXXXXXXXXXXXXXXX/lib/google/api_core/grpc_helpers.py", line 59, in error_remapped_callable
six.raise_from(exceptions.from_grpc_error(exc), exc)
File "/base/alloc/tmpfs/dynamic_runtimes/python27g/ebb3af67a06047b6/python27/python27_lib/versions/third_party/six-1.12.0/six/__init__.py", line 737, in raise_from
raise value
ServiceUnavailable: 503 {
"created":"@1583436423.131570193",
"description":"Delayed close due to in-progress write",
"file":"third_party/apphosting/python/grpcio/v1_0_0/src/core/ext/transport/chttp2/transport/chttp2_transport.c",
"file_line":412,
"grpc_status":14,
"referenced_errors":[{
"created":"@1583436423.131561040",
"description":"OS Error",
"errno":32,
"file":"third_party/apphosting/python/grpcio/v1_0_0/src/core/lib/iomgr/tcp_posix.c",
"file_line":393,
"os_error":"Broken pipe",
"syscall":"sendmsg"}
]}
Here is another:
Traceback (most recent call last):
...
File "/base/data/home/apps/s~my_project/dev.XXXXXXXXXXXXXXXXXXX/src/utils/pt_gc_tasks.py", line 72, in _gc_create_task
_ = _tasks_client.create_task(parent=_queue_path(DEFAULT_QUEUE), task=task)
File "/base/data/home/apps/s~my_project/dev.XXXXXXXXXXXXXXXXXXX/lib/google/cloud/tasks_v2/gapic/cloud_tasks_client.py", line 1512, in create_task
request, retry=retry, timeout=timeout, metadata=metadata
File "/base/data/home/apps/s~my_project/dev.XXXXXXXXXXXXXXXXXXX/lib/google/api_core/gapic_v1/method.py", line 143, in __call__
return wrapped_func(*args, **kwargs)
File "/base/data/home/apps/s~my_project/dev.XXXXXXXXXXXXXXXXXXX/lib/google/api_core/retry.py", line 273, in retry_wrapped_func
on_error=on_error,
File "/base/data/home/apps/s~my_project/dev.XXXXXXXXXXXXXXXXXXX/lib/google/api_core/retry.py", line 182, in retry_target
return target()
File "/base/data/home/apps/s~my_project/dev.XXXXXXXXXXXXXXXXXXX/lib/google/api_core/timeout.py", line 214, in func_with_timeout
return func(*args, **kwargs)
File "/base/data/home/apps/s~my_project/dev.XXXXXXXXXXXXXXXXXXX/lib/google/api_core/grpc_helpers.py", line 59, in error_remapped_callable
six.raise_from(exceptions.from_grpc_error(exc), exc)
File "/base/alloc/tmpfs/dynamic_runtimes/python27g/ebb3af67a06047b6/python27/python27_lib/versions/third_party/six-1.12.0/six/__init__.py", line 737, in raise_from
raise value
ServiceUnavailable: 503 {
"created":"@1583407622.505288938",
"description":"Endpoint read failed",
"file":"third_party/apphosting/python/grpcio/v1_0_0/src/core/ext/transport/chttp2/transport/chttp2_transport.c",
"file_line":1807,
"grpc_status":14,
"occurred_during_write":0,
"referenced_errors":[{
"created":"@1583407622.505108366",
"description":"Secure read failed",
"file":"third_party/apphosting/python/grpcio/v1_0_0/src/core/lib/security/transport/secure_endpoint.c",
"file_line":158,
"referenced_errors":[{
"created":"@1583407622.505106550",
"description":"Socket closed",
"file":"third_party/apphosting/python/grpcio/v1_0_0/src/core/lib/iomgr/tcp_posix.c",
"file_line":259}
]}
]}
Am I enqueuing too many tasks at once? What can I do to fix this?
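Judging by their description text, the two errors you shared appear to have different causes, but both could indeed be related to overloading your queue with tasks.
As a workaround, you can set rate limits to reduce the load, or you can set retry parameters, since the error apparently affects only a few tasks. Whichever you choose, the Cloud Tasks "Configuring Queues" documentation explains how to do it.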
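To make the two options concrete, here is a minimal sketch of what such queue settings could look like. The field names follow the Cloud Tasks v2 Queue resource (rate_limits and retry_config); the helper name build_queue_settings, the project path, and all numeric values are assumptions for illustration only. Keep in mind that these settings govern how the queue dispatches and retries tasks after they are enqueued. The dict is only built here and is not sent to the API.

```python
def build_queue_settings(queue_name, max_per_second, max_concurrent, max_attempts):
    """Return a dict shaped like the Cloud Tasks v2 Queue resource (illustrative values)."""
    if max_per_second <= 0 or max_concurrent <= 0:
        raise ValueError("rate limits must be positive")
    return {
        "name": queue_name,
        "rate_limits": {
            # Upper bound on how many tasks the queue dispatches per second.
            "max_dispatches_per_second": float(max_per_second),
            # Upper bound on tasks being executed at the same time.
            "max_concurrent_dispatches": int(max_concurrent),
        },
        "retry_config": {
            # How often a failed task is attempted before the queue gives up.
            "max_attempts": int(max_attempts),
            "min_backoff": {"seconds": 1},
            "max_backoff": {"seconds": 60},
        },
    }


# Hypothetical project, location, and queue names.
settings = build_queue_settings(
    "projects/my-project/locations/us-central1/queues/default", 50, 10, 5)
```

A dict like this could then be handed to the client's queue update call or translated into the equivalent queue configuration, whichever you already use.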
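"HTTP Error 503. The service is unavailable" occurs when the Application Pool of the corresponding Web Application is Stopped, Disabled, or Paused. It can also occur when the user identity configured for the application pool is no longer valid because its password expired or the account is locked.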
After a lot of digging, "503 Service Unavailable" appears to be a very common error in the google-cloud SDKs across all GCP services.
- Why do I get 503 Service Unavailable errors using the Google Cloud Datastore API explorer?
- https://github.com/googleapis/google-cloud-python/issues/3128
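The solution is to enable retry logic. google-api-core (a dependency of google-cloud-tasks) has an existing retry mechanism, but it is not configured for task creation: retry_codes_name is set to non_idempotent rather than idempotent.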
"CreateTask": {
"timeout_millis": 10000,
"retry_codes_name": "non_idempotent",
"retry_params_name": "default",
},
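My guess is that retrying could cause duplicate tasks to be enqueued. However, if you specify a task name, Cloud Tasks should prevent those duplicates from being enqueued.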
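One way to get a stable task name is to derive it from the data the task processes, so that a retried create_task() call sends exactly the same name. The sketch below assumes each record can be identified by its source file plus a record ID; the helper task_name_for, the queue path, and the file name are hypothetical. The task ID is hashed so it contains only characters allowed in task names and does not form a sequential prefix.

```python
import hashlib


def task_name_for(queue_path, source_file, record_id):
    """Build a deterministic task name so a retried create_task reuses it."""
    key = "{}:{}".format(source_file, record_id).encode("utf-8")
    # A hex digest only contains [0-9a-f], which is valid inside a task ID.
    task_id = hashlib.sha256(key).hexdigest()
    return "{}/tasks/{}".format(queue_path, task_id)


# Hypothetical queue path and record identity.
queue_path = "projects/my-project/locations/us-central1/queues/default"
name = task_name_for(queue_path, "bucket/2020-03-05.csv", 42)
task = {"name": name}  # other task fields (such as the request target) omitted
```

With a name like this, a second attempt for the same record should be rejected as already existing instead of creating a duplicate.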
So I pass a Retry object to .create_task() without providing a predicate argument, which makes it default to if_transient_error(). That predicate retries the following errors: exceptions.InternalServerError, exceptions.TooManyRequests, exceptions.ServiceUnavailable.
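To illustrate what a predicate-based retry does, here is a minimal standard-library sketch. It is not the google-api-core implementation: TransientError, PermanentError, is_transient, call_with_retry, and flaky_create_task are all hypothetical stand-ins, and the fixed delay replaces the library's exponential backoff.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for errors such as ServiceUnavailable."""


class PermanentError(Exception):
    """Hypothetical stand-in for errors that should not be retried."""


def is_transient(exc):
    """Predicate: retry only errors that are expected to go away."""
    return isinstance(exc, TransientError)


def call_with_retry(func, predicate=is_transient, max_attempts=5,
                    delay=0.0, sleep=time.sleep):
    """Call func, retrying while the predicate accepts the raised error."""
    attempt = 0
    while True:
        attempt += 1
        try:
            return func()
        except Exception as exc:
            # Give up on non-retryable errors or when attempts are exhausted.
            if not predicate(exc) or attempt >= max_attempts:
                raise
            sleep(delay)


calls = []


def flaky_create_task():
    """Fails twice with a transient error, then succeeds."""
    calls.append(1)
    if len(calls) < 3:
        raise TransientError("503 service unavailable")
    return "created"


result = call_with_retry(flaky_create_task)
```

Here the first two failures are swallowed and the third call succeeds, while an error the predicate rejects would be raised to the caller right away.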
Below is the snippet I use to create tasks:
import logging

from google.api_core import retry
from google.api_core.exceptions import AlreadyExists
from google.cloud import tasks

_tasks_client = tasks.CloudTasksClient()


def my_create_task_function(my_queue_path, task_object):
    try:
        _tasks_client.create_task(
            parent=my_queue_path,
            task=task_object,
            # Copies the default retry config from retry_params in
            # google.cloud.tasks_v2.gapic.cloud_tasks_client_config
            retry=retry.Retry(
                initial=.1,
                maximum=60,
                multiplier=1.3,
                deadline=600))
    except AlreadyExists:
        # Raised when a task with the same name has already been enqueued.
        logging.warning("found existing task")
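For a feel of what initial=.1, maximum=60, and multiplier=1.3 mean in practice, the sketch below computes the nominal wait before each retry. It is an approximation under stated assumptions: the library also randomizes its waits, which is left out here, and the helper name backoff_delays is hypothetical.

```python
def backoff_delays(initial, multiplier, maximum, attempts):
    """Return the nominal (un-jittered) wait in seconds before each retry."""
    delays = []
    delay = initial
    for _ in range(attempts):
        # Each wait grows by `multiplier` but never exceeds `maximum`.
        delays.append(min(delay, maximum))
        delay *= multiplier
    return delays


delays = backoff_delays(initial=0.1, multiplier=1.3, maximum=60, attempts=30)

# Index of the first retry whose wait has reached the 60 second cap.
first_capped = next(i for i, d in enumerate(delays) if d >= 60)
```

With these values the waits start at a tenth of a second and grow slowly, so a short burst of 503 errors is retried many times well before the 600 second deadline is reached.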
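There is also a logger whose level you can adjust so that you can see log statements when a retry actually happens.
If you do the following:
logging.getLogger('google.api_core.retry').setLevel(logging.DEBUG)
then you should see the corresponding messages in your logs whenever the retry logic kicks in.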
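Setting the logger level alone may not be enough if no handler lets DEBUG records through. The following is a minimal sketch, assuming you want the retry messages on the standard error stream; the format string is an arbitrary choice, and on App Engine the platform's own log handling may already cover this.

```python
import logging

retry_logger = logging.getLogger("google.api_core.retry")
retry_logger.setLevel(logging.DEBUG)

# Attach a handler that accepts DEBUG records; without one, the messages
# can still be dropped by a stricter handler further up the hierarchy.
handler = logging.StreamHandler()
handler.setLevel(logging.DEBUG)
handler.setFormatter(
    logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s"))
retry_logger.addHandler(handler)
```

Remember to lower the level again once you are done debugging, since retry logging can be noisy during a large fan-out.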