Where to define number of consumers for the specific consumer group?

Question

I am using Spark Streaming to consume data from a Kafka topic.

If I use the DirectStream approach, I have no option to define the consumer group or the number of consumers.

For example:

val messages = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](ssc, kafkaParams, topicsSet)

Where do I define the consumer group and the number of consumers for that group?

If I use the receiver-based approach, I can define the consumer group and the number of threads (that is, the number of consumers for that group).

Receiver-based approach:

val topicMap = topics.split(",").map((_, numThreads.toInt)).toMap
val lines = KafkaUtils.createStream(ssc, zkQuorum, group, topicMap).map(_._2)

Answer 1

The Spark Streaming DirectStream approach has no concept of a consumer group.

According to the Spark Streaming documentation:

With directStream, Spark Streaming will create as many RDD partitions as there are Kafka partitions to consume, which will all read data from Kafka in parallel. So there is a one-to-one mapping between Kafka and RDD partitions
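The one-to-one mapping described in the quote can be observed from inside the job. The following is a minimal sketch, assuming the Kafka 0.8 integration (the `org.apache.spark.streaming.kafka` package, which matches the `createDirectStream` call in the question). The broker address, topic name, application name, and local master are hypothetical placeholders, not values from the question.

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.{HasOffsetRanges, KafkaUtils}

object DirectStreamParallelism {
  // Hypothetical values for illustration; replace with your own brokers and topics.
  val brokers = "localhost:9092"
  val topicsSet = Set("my-topic")

  // The direct approach reads from the brokers themselves, so it takes a broker
  // list instead of a ZooKeeper quorum, and there is no per-topic thread count.
  val kafkaParams: Map[String, String] = Map("metadata.broker.list" -> brokers)

  def main(args: Array[String]): Unit = {
    // local[2] is an assumption for trying this out on a single machine.
    val conf = new SparkConf().setAppName("DirectStreamParallelism").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(5))

    val messages = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topicsSet)

    messages.foreachRDD { rdd =>
      // The cast only works on the RDD produced directly by the direct stream,
      // before any transformation that changes the partitioning.
      val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
      println(s"RDD partitions: ${rdd.partitions.length}, Kafka partitions read: ${offsetRanges.length}")
      offsetRanges.foreach { o =>
        println(s"${o.topic} partition ${o.partition}: offsets ${o.fromOffset} to ${o.untilOffset}")
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

With this approach the read parallelism follows the number of Kafka partitions in the subscribed topics rather than a thread count passed to Spark. If more parallelism is needed for downstream processing, one option is to call `repartition` on the stream, at the cost of a shuffle.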

scala

spark-streaming

kafka-consumer-api