Kafka Streams records not forwarding after windowing/aggregation
I am using Kafka Streams with a tumbling window followed by an aggregation step, but I have observed that the number of tuples emitted downstream of the aggregation is dropping. Any idea where I'm going wrong?
Code:
Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "events_streams_local");
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
props.put(StreamsConfig.METRIC_REPORTER_CLASSES_CONFIG, Arrays.asList(JmxReporter.class));
props.put(StreamsConfig.STATE_DIR_CONFIG, "/tmp/kafka-streams/data/");
props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 20);
props.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 60000);
props.put(StreamsConfig.DEFAULT_TIMESTAMP_EXTRACTOR_CLASS_CONFIG, EventTimeExtractor.class);
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest");
final StreamsBuilder builder = new StreamsBuilder();
HashGenerator hashGenerator = new HashGenerator(1);
builder
    .stream(inputTopics)
    .mapValues((key, value) -> {
        stats.incrInputRecords();
        Event event = jsonUtil.fromJson((String) value, Event.class);
        return event;
    })
    .filter(new UnifiedGAPingEventFilter(stats))
    .selectKey(new KeyValueMapper<Object, Event, String>() {
        @Override
        public String apply(Object key, Event event) {
            return (String) key;
        }
    })
    .groupByKey(Grouped.with(Serdes.String(), eventSerdes))
    .windowedBy(TimeWindows.of(Duration.ofSeconds(30)))
    .aggregate(new AggregateInitializer(), new UserStreamAggregator(), Materialized.with(Serdes.String(), aggrSerdes))
    .mapValues((k, v) -> {
        // update counter for aggregate records
        return v;
    })
    .toStream()
    .map(new RedisSink(stats));
topology = builder.build();
streams = new KafkaStreams(topology, props);
Redis operations per second just dropped.
Kafka Streams uses caching on its state stores to reduce downstream load. If you want every update to a store to be forwarded as a downstream record, you can disable caching, either globally for all stores via StreamsConfig#CACHE_MAX_BYTES_BUFFERING_CONFIG, or per store by passing Materialized.as(...).withCachingDisabled() into the corresponding operator (e.g., aggregate()).
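Here is a minimal sketch of both options against your topology (the store name "user-stream-aggr" and the value type Aggregate are placeholders for whatever your aggrSerdes actually serializes):

// Option 1: disable caching globally, for all state stores.
props.put(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, 0);

// Option 2: disable caching only for this aggregation's store by
// materializing it explicitly and turning caching off on it.
.groupByKey(Grouped.with(Serdes.String(), eventSerdes))
.windowedBy(TimeWindows.of(Duration.ofSeconds(30)))
.aggregate(
    new AggregateInitializer(),
    new UserStreamAggregator(),
    Materialized.<String, Aggregate, WindowStore<Bytes, byte[]>>as("user-stream-aggr")
        .withKeySerde(Serdes.String())
        .withValueSerde(aggrSerdes)
        .withCachingDisabled());

(WindowStore here is org.apache.kafka.streams.state.WindowStore and Bytes is org.apache.kafka.common.utils.Bytes.) Also note that with caching enabled, the cache is flushed on commit, so with your commit.interval.ms of 60000 you would see at most roughly one downstream update per key per minute (unless the cache fills up first), which lines up with the drop you see in Redis operations per second.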
Check out the docs for more details: https://docs.confluent.io/current/streams/developer-guide/memory-mgmt.html