Kafka Streams state store distribution

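I have a Kafka application running on multiple instances, and I want to use a state store to cache some data fields. With multiple application instances, if one instance goes down, does its local state store get copied to another instance? What happens when the instance comes back? How are the state stores connected to the data keys for proper redistribution?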

If one instance goes down, does the local state store of that instance get copied to another instance?

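Yes. If you have no standby replicas, the task that takes over reads the changelog topic from the beginning to rebuild the store, which effectively makes a copy.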

In the docs,

Starting in 2.6, Kafka Streams will guarantee that a task is only ever assigned to an instance with a fully caught-up local copy of the state, if such an instance exists. Standby tasks will increase the likelihood that a caught-up instance exists in the case of a failure.
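As a rough illustration of how standby tasks are switched on, here is a minimal sketch that builds the settings with plain `java.util.Properties`. `num.standby.replicas` is the Kafka Streams setting (default 0). The application id, the broker address, and the `StandbyConfigSketch` class are made up for this example.

```java
import java.util.Properties;

public class StandbyConfigSketch {
    // Builds Kafka Streams settings with the requested number of standby replicas.
    public static Properties streamsProps(int standbyReplicas) {
        Properties props = new Properties();
        // Hypothetical application id and broker address, for illustration only.
        props.put("application.id", "state-store-cache-app");
        props.put("bootstrap.servers", "localhost:9092");
        // Each standby replica keeps a warm copy of a task's state on another instance.
        props.put("num.standby.replicas", Integer.toString(standbyReplicas));
        return props;
    }

    public static void main(String[] args) {
        Properties props = streamsProps(1);
        System.out.println("num.standby.replicas = " + props.getProperty("num.standby.replicas"));
    }
}
```

In a real application these properties would be passed to the `KafkaStreams` constructor along with the topology. With at least one standby, failover can usually go to an instance that already holds most of the state instead of replaying the whole changelog.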


How are the state stores connected to the data keys for proper redistribution?

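Partitions are mapped to tasks, which run on stream threads (see the same page). A record's key determines its partition, and each task owns the state store for the partitions assigned to it, so the state moves together with its task when tasks are redistributed.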
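The key-to-partition-to-task chain can be sketched roughly as follows. This is a simplified illustration, not Kafka's actual partitioner: the default partitioner hashes the serialized key with murmur2, while this sketch uses `Arrays.hashCode` as a stand-in. The `KeyToTaskSketch` class and its method names are hypothetical. The `<subtopology>_<partition>` form of the task id follows Kafka Streams' naming.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class KeyToTaskSketch {
    // Stand-in for the producer's partitioner: a deterministic hash of the key bytes,
    // forced non-negative, modulo the partition count. Kafka itself uses murmur2 here.
    public static int partitionFor(String key, int numPartitions) {
        byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
        int hash = Arrays.hashCode(bytes);
        return (hash & 0x7fffffff) % numPartitions;
    }

    // Kafka Streams names a task "<subtopologyId>_<partition>", so one partition
    // of a sub-topology always corresponds to the same task id.
    public static String taskIdFor(int subtopologyId, int partition) {
        return subtopologyId + "_" + partition;
    }

    public static void main(String[] args) {
        int numPartitions = 4;
        for (String key : new String[] {"user-1", "user-2", "user-1"}) {
            int partition = partitionFor(key, numPartitions);
            System.out.println(key + " -> partition " + partition
                    + " -> task " + taskIdFor(0, partition));
        }
    }
}
```

Because the mapping is deterministic, every record with a given key lands in the same partition and therefore in the same task's state store, whichever instance is currently running that task.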