Why does PubSub resend messages after they were acked?

I have a small project in which I publish some messages (~1000) and then try to process them on a single thread, but I still receive duplicate messages.

Is this the expected behavior of PubSub?

This is the code that creates the subscriber:

    ExecutorProvider executorProvider =
            InstantiatingExecutorProvider.newBuilder().setExecutorThreadCount(1).build();

    // create subscriber
    subscriber = Subscriber.newBuilder(subscriptionName, messageReceiver).setExecutorProvider(executorProvider).build();
    subscriber.startAsync();

Here is the demo: https://github.com/andonescu/play-pubsub

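I published 1000 messages. Each one took 300 ms to process (I added the delay on purpose), after which ack() was called. The ack deadline on the subscription is 10 seconds. Based on all this, I should not receive duplicate messages, yet more than 10% of the messages I sent were delivered again.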

Here is the log: https://github.com/andonescu/play-pubsub/blob/master/reports/1000-messages-reader-status

I have also posted the same question at https://github.com/GoogleCloudPlatform/pubsub/issues/182

After reading the PubSub documentation more carefully, I found the following section:

However, messages may sometimes be delivered out of order or more than once. In general, accommodating more-than-once delivery requires your subscriber to be idempotent when processing messages. You can achieve exactly once processing of Cloud Pub/Sub message streams using Cloud Dataflow PubsubIO. PubsubIO de-duplicates messages on custom message identifiers or those assigned by Cloud Pub/Sub.

https://cloud.google.com/pubsub/docs/subscriber#at-least-once-delivery

So it looks like Cloud Dataflow PubsubIO is the key for me.

Or I can use a unique ID and do the de-duplication in the client :)
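For the client-side option, here is a minimal sketch of what the de-duplication could look like. It is not taken from the demo project: the class name `MessageDeduplicator`, the method `markIfNew`, and the bounded-cache size are all hypothetical, and it assumes a redelivered message arrives with the same message ID as the original delivery. It only remembers the most recent IDs in memory, so it reduces duplicates within one process rather than guaranteeing exactly-once processing.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: remembers recently seen message IDs so redeliveries can be skipped. */
public class MessageDeduplicator {
    private final Map<String, Boolean> seen;

    public MessageDeduplicator(final int maxEntries) {
        // Insertion-ordered map that drops the oldest ID once the cap is exceeded,
        // wrapped so it stays safe if the receiver is ever called from several threads.
        this.seen = Collections.synchronizedMap(new LinkedHashMap<String, Boolean>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > maxEntries;
            }
        });
    }

    /** Returns true the first time an ID is seen, false if it is still remembered. */
    public boolean markIfNew(String messageId) {
        return seen.putIfAbsent(messageId, Boolean.TRUE) == null;
    }
}
```

Inside the message receiver, the idea would be to call `markIfNew(message.getMessageId())`, run the processing only when it returns true, and call ack() in both cases so the duplicate is not delivered yet again. The cap should be large enough to cover the window in which redeliveries are expected.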