GCP Documentation - Task Queue bucket_size and rate
I have read many articles and answers about Google Cloud Tasks here, and my doubt is about the behavior of "rate" and "bucket_size".
I read this documentation:
https://cloud.google.com/appengine/docs/standard/java/configyaml/queue
The snippet is:
Configuring the maximum number of concurrent requests
If using the default max_concurrent_requests settings are not
sufficient, you can change the settings for max_concurrent_requests,
as shown in the following example:
If your application queue has a rate of 20/s and a bucket size of 40,
tasks in that queue execute at a rate of 20/s and can burst up to 40/s
briefly. These settings work fine if task latency is relatively low;
however, if latency increases significantly, you'll end up processing
significantly more concurrent tasks. This extra processing load can
consume extra instances and slow down your application.
For example, let's assume that your normal task latency is 0.3
seconds. At this latency, you'll process at most around 40 tasks
simultaneously. But if your task latency increases to 5 seconds, you
could easily have over 100 tasks processing at once. This increase
forces your application to consume more instances to process the extra
tasks, potentially slowing down the entire application and interfering
with user requests.
You can avoid this possibility by setting max_concurrent_requests to a
lower value. For example, if you set max_concurrent_requests to 10,
our example queue maintains about 20 tasks/second when latency is 0.3
seconds. However, when the latency increases over 0.5 seconds, this
setting throttles the processing rate to ensure that no more than 10
tasks run simultaneously.
queue:
# Set the max number of concurrent requests to 10
- name: optimize-queue
  rate: 20/s
  bucket_size: 40
  max_concurrent_requests: 10
My understanding of how the queue works is:
The bucket is the unit that determines how many tasks are executed.
The rate is how much the bucket is refilled each period.
max_concurrent_requests is the maximum number of tasks executing simultaneously.
This part of the quote seems strange to me:
But if your task latency increases to 5 seconds, you could easily have
over 100 tasks processing at once. This increase forces your
application to consume more instances to process the extra tasks,
potentially slowing down the entire application and interfering with
user requests.
Assume max_concurrent_requests is not set.
To me, it should be impossible to have more than 100 tasks executing, because bucket_size is 40. To me, slow tasks would only affect how long other tasks wait for a free slot in the bucket.
Why does the documentation say there can be more than 100 tasks?
If the bucket is 40, can more than 40 run at the same time?
Edit
Is the bucket refilled only after all of its tasks finish executing, or is the next rate's worth of tokens added as soon as a slot becomes free?
Example:
40 bucket slots are executing.
1 slot finishes.
Imagine each task takes more than 0.5 seconds, and some take more than 1 second.
When 1 slot becomes free, is it refilled in the next second, or does the bucket wait for all tasks to finish before refilling again?
Bucket size is defined more precisely in the doc you linked, but you can think of it as an initial burst limit.
With the parameters you provided in the question, my understanding is as follows:
- bucket_size: 40
- rate: 20/s
- max_concurrent_requests: 10
In the first second (t1), 40 tasks will start processing. At the same time, 20 tokens (based on rate) are added to the bucket. So at t2, 20 tasks will be ready for processing, and another 20 tokens will be added to the bucket.
Without max_concurrent_requests, those 20 tasks would start processing. With max_concurrent_requests set to 10, nothing happens, because more than 10 requests are already in use.
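Incidentally, the "over 100 tasks" figure in the quoted docs is just Little's law: sustained concurrency ≈ dispatch rate × task latency, independent of bucket_size. A quick sanity check with the numbers from the question (ignoring the concurrency cap):

```python
rate = 20            # tasks dispatched per second
latency_fast = 0.3   # seconds per task
latency_slow = 5.0

# Little's law: average concurrency = arrival rate * time in system
print(rate * latency_fast)  # 6.0   -> well under the bucket size
print(rate * latency_slow)  # 100.0 -> the figure the docs warn about
```

The bucket only limits how fast tasks *start*; once started, slow tasks pile up in flight, which is exactly why the docs recommend max_concurrent_requests.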
App Engine will keep adding tokens to the bucket at 20/s, but only if there is room in the bucket (bucket_size). Once there are 40 tokens in the bucket, it stops until some running requests finish and there is more space.
After the initial 40 tasks complete, no more than 10 tasks will ever execute at once.
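To make the whole picture concrete, here is a minimal discrete-time sketch of the token-bucket model described above (my own simplification, not the real App Engine scheduler): the bucket starts full, each dispatched task consumes one token, tokens refill continuously at the configured rate, and max_concurrent_requests optionally caps the number of tasks in flight.

```python
def simulate(rate, bucket_size, task_latency, duration,
             max_concurrent=None, dt=0.05):
    """Return the peak number of tasks in flight at any one time."""
    tokens = float(bucket_size)  # bucket starts full -> the initial burst
    running = []                 # finish times of in-flight tasks
    peak = 0
    t = 0.0
    while t < duration:
        # Retire tasks that have finished by time t.
        running = [f for f in running if f > t]
        # Dispatch while a whole token is available and the optional
        # concurrency cap is not reached.
        while tokens >= 1 and (max_concurrent is None
                               or len(running) < max_concurrent):
            tokens -= 1
            running.append(t + task_latency)
        peak = max(peak, len(running))
        # Refill at `rate` tokens/second, never beyond bucket_size.
        tokens = min(float(bucket_size), tokens + rate * dt)
        t += dt
    return peak

# 0.3 s tasks: the peak stays close to the initial burst of 40.
print(simulate(rate=20, bucket_size=40, task_latency=0.3, duration=30))
# 5 s tasks, no cap: concurrency climbs well past 100.
print(simulate(rate=20, bucket_size=40, task_latency=5, duration=30))
# 5 s tasks with max_concurrent_requests=10: held at 10.
print(simulate(rate=20, bucket_size=40, task_latency=5, duration=30,
               max_concurrent=10))
```

With fast tasks the bucket empties faster than tasks accumulate, so concurrency hovers near the burst size; with slow tasks the dispatcher keeps starting 20 tasks per second while old ones are still running, which is how more than 100 end up in flight unless max_concurrent_requests caps them.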