如果会话 window 在保留期之前结束并且不活动间隙在保留期之后结束,会发生什么情况?

What happens if session window ends before retention period and inactivity gap ends after retention period?

这是使用 Kafka Streams 的简单会话 window:

stream
  .groupBy()
  .windowedBy(SessionWindows.with(Duration.ofMinutes(30)).grace(Duration.ofMinutes(0)))
  .aggregate(...) // implementation of aggregate function

使用下面的代码,我们可以配置状态存储:

Materialized
  .as(Stores.persistentSessionStore(storeName, Duration.ofHours(2))
  .withCachingEnabled()
  .withLoggingEnabled()
  .withKeySerde(keySerde)
  .withValueSerde(valueSerde)

Documentation 状态:

Note that the retention period must be at least long enough to contain the windowed data's entire life cycle, from window-start through window-end, and for the entire grace period.

我们不适用宽限期。但请考虑这种情况:会话 window 在保留期之前结束,但不活动间隙在保留期之后结束。我想知道,会话数据是否有可能丢失?清理的应用有多积极?

似乎是 TimeWindowed 存储的 c&p 错误。

对比代码:

https://github.com/apache/kafka/blob/2.3/streams/src/main/java/org/apache/kafka/streams/kstream/internals/SessionWindowedKStreamImpl.java#L186

https://github.com/apache/kafka/blob/2.3/streams/src/main/java/org/apache/kafka/streams/kstream/internals/TimeWindowedKStreamImpl.java#L166

我创建了一个 JIRA 票证来修复它:https://issues.apache.org/jira/browse/KAFKA-9068