Kafka Streams 1.1.0: Consumer Group Reprocessing Entire Log

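We have a Kafka Streams application (2.0) that talks to Kafka brokers (1.1.0). The Streams application keeps reprocessing the entire log for no apparent reason: the application has not been restarted and there has been no rebalance. In some cases it happens while the application is processing messages; in others it happens while the application is idle and waiting for messages (having last processed messages less than 6 hours earlier). We have done a fair amount of research and ruled out one potential cause, offset expiry, by setting offsets.retention.minutes to 1 week (10080 minutes, the same as our message retention time). Besides, offset expiry would not explain the consumer group's offsets being reset while the group is actively processing messages.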

There is nothing interesting in the broker logs around the time of the incident:

[2019-02-21 09:02:20,009] INFO [GroupMetadataManager brokerId=2] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2019-02-21 09:12:20,009] INFO [GroupMetadataManager brokerId=2] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2019-02-21 09:12:51,084] INFO [ProducerStateManager partition=MY_TOPIC-1] Writing producer snapshot at offset 422924 (kafka.log.ProducerStateManager)
[2019-02-21 09:12:51,085] INFO [Log partition=MY_TOPIC-1, dir=/data1/kafka] Rolled new log segment at offset 422924 in 1 ms. (kafka.log.Log)
[2019-02-21 09:14:56,384] INFO [ProducerStateManager partition=MY_TOPIC-12] Writing producer snapshot at offset 295610 (kafka.log.ProducerStateManager)
[2019-02-21 09:14:56,384] INFO [Log partition=MY_TOPIC-12, dir=/data1/kafka] Rolled new log segment at offset 295610 in 1 ms. (kafka.log.Log)
[2019-02-21 09:15:19,365] INFO [ProducerStateManager partition=__transaction_state-8] Writing producer snapshot at offset 3939084 (kafka.log.ProducerStateManager)
[2019-02-21 09:15:19,365] INFO [Log partition=__transaction_state-8, dir=/data1/kafka] Rolled new log segment at offset 3939084 in 0 ms. (kafka.log.Log)
[2019-02-21 09:21:26,755] INFO [ProducerStateManager partition=MY_TOPIC-9] Writing producer snapshot at offset 319799 (kafka.log.ProducerStateManager)
[2019-02-21 09:21:26,755] INFO [Log partition=MY_TOPIC-9, dir=/data1/kafka] Rolled new log segment at offset 319799 in 1 ms. (kafka.log.Log)
[2019-02-21 09:22:20,009] INFO [GroupMetadataManager brokerId=2] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2019-02-21 09:23:31,283] INFO [ProducerStateManager partition=__consumer_offsets-17] Writing producer snapshot at offset 47345110 (kafka.log.ProducerStateManager)
[2019-02-21 09:23:31,297] INFO [Log partition=__consumer_offsets-17, dir=/data1/kafka] Rolled new log segment at offset 47345110 in 28 ms. (kafka.log.Log)

And there is absolutely nothing in the application logs (even with the log level set to DEBUG).
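One way to double-check the claim about the broker log is to scan it for GroupMetadataManager cleanup runs that removed a non-zero number of offsets. The following is only a minimal sketch: the function names are hypothetical, the regular expression assumes the exact line format shown in the excerpt above, and it only tells you something if the log comes from the broker acting as group coordinator for the application's consumer group.

```python
import re

# Matches the periodic offset-cleanup line written by GroupMetadataManager,
# assuming the same format as the broker log excerpt in this post.
EXPIRED_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] INFO \[GroupMetadataManager brokerId=(?P<broker>\d+)\] "
    r"Removed (?P<count>\d+) expired offsets"
)


def expired_offset_events(lines):
    """Return (timestamp, broker_id, removed_count) for each cleanup line."""
    events = []
    for line in lines:
        match = EXPIRED_RE.match(line.strip())
        if match:
            events.append(
                (match.group("ts"), int(match.group("broker")), int(match.group("count")))
            )
    return events


def nonzero_expirations(lines):
    """Keep only the cleanup runs that actually removed committed offsets."""
    return [event for event in expired_offset_events(lines) if event[2] > 0]


# Three lines copied from the excerpt above; only two are cleanup lines.
sample = [
    "[2019-02-21 09:02:20,009] INFO [GroupMetadataManager brokerId=2] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)",
    "[2019-02-21 09:12:51,084] INFO [ProducerStateManager partition=MY_TOPIC-1] Writing producer snapshot at offset 422924 (kafka.log.ProducerStateManager)",
    "[2019-02-21 09:22:20,009] INFO [GroupMetadataManager brokerId=2] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)",
]

if __name__ == "__main__":
    for ts, broker, count in nonzero_expirations(sample):
        print(f"{ts}: broker {broker} removed {count} expired offsets")
```

For the lines in the excerpt, every cleanup run reports 0 removed offsets, so nonzero_expirations finds nothing, which is consistent with offset expiry not being the cause.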

Any ideas about what might be causing this?

Upgrading the Kafka brokers to 2.0.0 resolved the issue.