Why does PubSub resend messages after they were acked?
I have a small project in which I push some messages (~1000) and then try to process them on a single thread, yet I still receive duplicate messages.
Is this the expected behaviour of PubSub?
This is the code that creates the subscriber:
ExecutorProvider executorProvider =
    InstantiatingExecutorProvider.newBuilder()
        .setExecutorThreadCount(1)
        .build();

// create subscriber
subscriber = Subscriber.newBuilder(subscriptionName, messageReceiver)
    .setExecutorProvider(executorProvider)
    .build();
subscriber.startAsync();
Here is the demo: https://github.com/andonescu/play-pubsub
I pushed 1000 messages, each one took 300 ms to process (a delay added on purpose), and then ack() was called. The ack deadline on the subscription is 10 seconds. Based on all of this I should not receive any duplicate messages, yet more than 10% of the messages sent were delivered again.
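For context, a minimal sketch of what the messageReceiver referenced above might look like with this setup (assuming the usual com.google.cloud.pubsub.v1 MessageReceiver / AckReplyConsumer and com.google.pubsub.v1.PubsubMessage classes) — the Thread.sleep(300) stands in for the intentional processing delay, and the actual handler in the repo may differ:

MessageReceiver messageReceiver =
    (PubsubMessage message, AckReplyConsumer consumer) -> {
        try {
            Thread.sleep(300); // simulate the intentional ~300 ms of processing
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        consumer.ack(); // acknowledge only after the work is done
    };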
Here are the logs: https://github.com/andonescu/play-pubsub/blob/master/reports/1000-messages-reader-status
I have also posted the same question elsewhere.
Reading the PubSub documentation carefully, I found the following section:
However, messages may sometimes be delivered out of order or more than once. In general, accommodating more-than-once delivery requires your subscriber to be idempotent when processing messages. You can achieve exactly once processing of Cloud Pub/Sub message streams using Cloud Dataflow PubsubIO. PubsubIO de-duplicates messages on custom message identifiers or those assigned by Cloud Pub/Sub.
https://cloud.google.com/pubsub/docs/subscriber#at-least-once-delivery
It looks like Cloud Dataflow PubsubIO is the key for me.
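For illustration only, a rough Apache Beam sketch of what that could look like — the project, subscription and the "uniqueId" attribute name are placeholders, not values from the project above:

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.pubsub.PubsubIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

Pipeline pipeline = Pipeline.create(PipelineOptionsFactory.create());
pipeline.apply("ReadFromPubSub",
    PubsubIO.readMessagesWithAttributes()
        .fromSubscription("projects/my-project/subscriptions/my-subscription")
        // de-duplicate on a custom message attribute; omit to rely on Pub/Sub's own ids
        .withIdAttribute("uniqueId"));
// ... further transforms, then pipeline.run()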
Or use a uniqueId and do the de-duplication on the client side :)
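A rough sketch of that second option, de-duplicating on the client side — here keyed on Pub/Sub's own messageId (a custom uniqueId attribute would work the same way), and process() is just a placeholder for the real handling:

import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// ids of messages that have already been processed
Set<String> seenIds = ConcurrentHashMap.newKeySet();

MessageReceiver dedupingReceiver =
    (PubsubMessage message, AckReplyConsumer consumer) -> {
        // process only the first delivery of each id
        if (seenIds.add(message.getMessageId())) {
            process(message); // placeholder for the real work
        }
        consumer.ack(); // ack redeliveries too, so they stop being resent
    };

In a real consumer the set would have to be bounded somehow (for example by expiring old ids), since it grows with every message processed.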