"spark.streaming.blockInterval" 在 Spark Streaming DirectAPI 中有什么用
What is use of "spark.streaming.blockInterval" in Spark Streaming DirectAPI
I would like to understand what role "spark.streaming.blockInterval" plays in the Spark Streaming Direct API. As per my understanding, "spark.streaming.blockInterval" is used to calculate the number of partitions, i.e. #partitions = (receivers × batchInterval) / blockInterval, but with the Direct API the number of Spark Streaming partitions equals the number of Kafka partitions (see the worked example below).
How is "spark.streaming.blockInterval" used in the Direct API?
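For reference, here is a minimal sketch of the arithmetic behind that receiver-based formula, assuming a hypothetical 10-second batch interval, a single receiver, and the default 200 ms block interval (all values are illustrative):

```scala
// Receiver-based streaming only: each receiver cuts its incoming data into
// one block per blockInterval, and each block becomes one RDD partition.
val receivers = 1       // hypothetical single receiver
val batchMs   = 10000L  // hypothetical 10-second batch interval
val blockMs   = 200L    // default spark.streaming.blockInterval (200 ms)

val partitionsPerBatch = receivers * batchMs / blockMs  // = 50
println(s"partitions per batch: $partitionsPerBatch")
```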
spark.streaming.blockInterval:
Interval at which data received by Spark Streaming receivers is chunked into blocks of data before storing them in Spark.
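For context, a minimal sketch of where that setting normally applies, i.e. a receiver-based input stream rather than the Direct API (the app name, host, and port are illustrative assumptions):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// blockInterval only matters for receiver-based input streams; 200ms is the default.
val conf = new SparkConf()
  .setAppName("receiver-based-sketch")            // illustrative name
  .setMaster("local[2]")                          // at least 2 cores: 1 receiver + processing
  .set("spark.streaming.blockInterval", "200ms")

val ssc = new StreamingContext(conf, Seconds(10))

// A receiver-based source: the receiver chunks incoming data into one block
// per blockInterval, and each block becomes one partition of the batch RDD.
val lines = ssc.socketTextStream("localhost", 9999)
lines.count().print()

ssc.start()
ssc.awaitTermination()
```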
And KafkaUtils.createDirectStream() does not use receivers.
With directStream, Spark Streaming will create as many RDD partitions
as there are Kafka partitions to consume
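To make that concrete, here is a minimal sketch of the Direct API, assuming the spark-streaming-kafka-0-10 integration; the broker address, topic name, and group id are placeholders:

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

val conf = new SparkConf().setAppName("direct-api-sketch").setMaster("local[*]")
val ssc  = new StreamingContext(conf, Seconds(10))

val kafkaParams = Map[String, Object](
  "bootstrap.servers"  -> "localhost:9092",           // placeholder broker
  "key.deserializer"   -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id"           -> "example-group",            // placeholder group id
  "auto.offset.reset"  -> "latest"
)

// No receivers here, so spark.streaming.blockInterval plays no role:
// each batch RDD gets exactly one partition per Kafka topic-partition consumed.
val stream = KafkaUtils.createDirectStream[String, String](
  ssc,
  PreferConsistent,
  Subscribe[String, String](Array("example-topic"), kafkaParams)
)

stream.foreachRDD { rdd =>
  println(s"RDD partitions this batch: ${rdd.getNumPartitions}") // == number of Kafka partitions
}

ssc.start()
ssc.awaitTermination()
```

If you need more parallelism than the topic's partition count provides, the usual options are adding Kafka partitions or calling rdd.repartition(n) on each batch, at the cost of a shuffle.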