How to load balance Kafka?

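Can anyone help me with load balancing in Kafka? What logic needs to be implemented? Would deploying a multi-broker, multi-node Kafka cluster solve the problem? Also, could someone explain how increasing the number of partitions affects Kafka's load balancing and throughput?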

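If you mean scaling out a Kafka cluster, the minimum you need to do is:

  • Add more brokers to the cluster
  • Rebalance topics and partitions across the brokers

This is described here: https://kafka.apache.org/documentation/#basic_ops_cluster_expansion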

Adding servers to a Kafka cluster is easy, just assign them a unique broker id and start up Kafka on your new servers. However these new servers will not automatically be assigned any data partitions, so unless partitions are moved to them they won't be doing any work until new topics are created. So usually when you add machines to your cluster you will want to migrate some existing data to these machines.

Once partitions have been moved to the new nodes, consumers and producers will automatically rebalance to use those nodes.
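As a rough illustration of the migration step, the sketch below builds a reassignment plan in the JSON layout that `kafka-reassign-partitions.sh` accepts via `--reassignment-json-file`. The topic name `orders`, the broker ids 1 to 4, the partition count, and the round-robin placement are all assumptions made for the example, not something Kafka prescribes; the tool's own `--generate` mode can propose a plan for you instead.

```python
import json


def build_reassignment(topic, num_partitions, broker_ids, replication_factor):
    """Spread each partition's replicas round-robin over broker_ids.

    This is a simplified placement strategy for illustration only.
    """
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor exceeds number of brokers")
    partitions = []
    for p in range(num_partitions):
        # The first replica in the list is the preferred leader.
        replicas = [
            broker_ids[(p + r) % len(broker_ids)]
            for r in range(replication_factor)
        ]
        partitions.append({"topic": topic, "partition": p, "replicas": replicas})
    return {"version": 1, "partitions": partitions}


# Hypothetical cluster: brokers 1-3 already existed, broker 4 was just added.
plan = build_reassignment("orders", 6, [1, 2, 3, 4], 2)
print(json.dumps(plan, indent=2))
```

The printed JSON could be saved to a file and passed to `kafka-reassign-partitions.sh` with `--reassignment-json-file` and `--execute`; the connection flags depend on your Kafka version, so check the linked documentation before running it.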

To understand how consumers and producers scale with the number of partitions, I recommend reading the Kafka key concepts: https://kafka.apache.org/documentation/#intro_concepts_and_terms

Topics are partitioned, meaning a topic is spread over a number of "buckets" located on different Kafka brokers. This distributed placement of your data is very important for scalability because it allows client applications to both read and write the data from/to many brokers at the same time. When a new event is published to a topic, it is actually appended to one of the topic's partitions. Events with the same event key (e.g., a customer or vehicle ID) are written to the same partition, and Kafka guarantees that any consumer of a given topic-partition will always read that partition's events in exactly the same order as they were written.
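To make the effect of the partition count more concrete, here is a minimal sketch of the two mechanisms involved: mapping a keyed event to a partition, and dividing partitions among the consumers of one group. Both functions are simplified stand-ins. Kafka's default partitioner hashes keys with murmur2 rather than CRC32, and the real consumer assignors (range, round-robin, sticky) are more involved; the key and consumer names are made up.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition: same key, same partition.

    Stand-in hash: Kafka's default partitioner uses murmur2, not CRC32.
    """
    return zlib.crc32(key) % num_partitions


def assign_partitions(num_partitions, consumers):
    """Deal partitions round-robin to the consumers of one group."""
    assignment = {c: [] for c in consumers}
    for p in range(num_partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return assignment


# Keyed events always land on the same partition for a fixed partition count.
for key in [b"customer-1", b"customer-2", b"vehicle-7"]:
    print(key, "->", partition_for(key, 6))

# Six partitions keep three consumers busy; with two partitions one sits idle.
print(assign_partitions(6, ["c1", "c2", "c3"]))
print(assign_partitions(2, ["c1", "c2", "c3"]))
```

Two consequences follow from this sketch. The number of partitions caps how many consumers in a group can work in parallel, so adding partitions can raise throughput. Because the partition is derived from the hash modulo the partition count, changing that count can send an existing key to a different partition, which matters if you rely on per-key ordering.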