Can Apache Kafka be used to aggregate hourly consumption data per customer?
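For example, electricity usage per customer. The consumption data is not streamed continuously; instead, it is inserted in batches that cover a certain number of customers for the past 12 hours. After each insert, we need to aggregate the hourly consumption per customer, and if the previous hour's consumption does not exist for some customers, we need to look up the "closest by date" consumption.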
Have you looked at Kafka Streams (https://kafka.apache.org/documentation/streams/)?
It lets you read a topic as a data stream and aggregate it over time windows:
StreamsBuilder builder = new StreamsBuilder();
builder.stream("topic-name")
    .groupByKey() // assuming the key is a customer-ID
    // TimeWindows.of is deprecated since Kafka 3.0; newer versions offer TimeWindows.ofSizeAndGrace(size, grace)
    .windowedBy(TimeWindows.of(Duration.ofHours(1)))
    .aggregate(...); // insert business logic here
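The windowed aggregation covers the hourly totals, but not the "closest by date" fallback from the question. As a rough sketch of that part only, here is plain Java with no Kafka dependency: it buckets readings by the start of their hour and, when the requested hour is missing, returns the nearest bucket that does exist. The class name HourlyConsumption, the methods addReading and closestHourly, and the kWh unit are all hypothetical names chosen for illustration, and preferring the earlier hour on a tie is an assumption too.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Kafka: hourly totals plus a closest-by-date lookup.
class HourlyConsumption {

    // Adds one reading to the per-customer totals, bucketed by the start of its hour (UTC).
    static void addReading(Map<String, TreeMap<Instant, Double>> totals,
                           String customerId, Instant timestamp, double kwh) {
        Instant hourStart = timestamp.truncatedTo(ChronoUnit.HOURS);
        totals.computeIfAbsent(customerId, id -> new TreeMap<>())
              .merge(hourStart, kwh, Double::sum);
    }

    // Returns the total for the requested hour, or the total of the nearest existing
    // hour if that one is missing. Returns null when the customer has no data at all.
    // Assumption: on a tie, the earlier hour wins.
    static Double closestHourly(TreeMap<Instant, Double> hours, Instant hourStart) {
        if (hours == null || hours.isEmpty()) {
            return null;
        }
        Map.Entry<Instant, Double> before = hours.floorEntry(hourStart);
        Map.Entry<Instant, Double> after = hours.ceilingEntry(hourStart);
        if (before == null) {
            return after.getValue();
        }
        if (after == null) {
            return before.getValue();
        }
        long gapBefore = ChronoUnit.SECONDS.between(before.getKey(), hourStart);
        long gapAfter = ChronoUnit.SECONDS.between(hourStart, after.getKey());
        return gapBefore <= gapAfter ? before.getValue() : after.getValue();
    }

    public static void main(String[] args) {
        Map<String, TreeMap<Instant, Double>> totals = new HashMap<>();
        addReading(totals, "c1", Instant.parse("2024-01-01T10:15:00Z"), 1.5);
        addReading(totals, "c1", Instant.parse("2024-01-01T10:45:00Z"), 2.0);
        addReading(totals, "c1", Instant.parse("2024-01-01T13:05:00Z"), 4.0);
        // 11:00 has no data of its own, so the closest bucket (10:00) is used.
        System.out.println(closestHourly(totals.get("c1"),
                Instant.parse("2024-01-01T11:00:00Z"))); // prints 3.5
    }
}
```

In a Kafka Streams application the same lookup could probably be done against the window state store after the aggregation, but how to wire that in depends on your topology, so treat the above as the logic only.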