从多线程驱动程序启动 Apache Spark SQL 个作业
Launching Apache Spark SQL jobs from multi-threaded driver
我想使用 Spark 从大约 1500 个远程 Oracle table 中提取数据,我想要一个多线程应用程序,每个线程获取一个 table 或者 10 tables 每个线程并启动一个 spark 作业来读取它们各自的 tables。
来自官方 spark 站点 https://spark.apache.org/docs/latest/job-scheduling.html 很明显这可以工作...
...cluster managers that Spark runs on provide facilities for scheduling across applications. Second, within each Spark application, multiple “jobs” (Spark actions) may be running concurrently if they were submitted by different threads. This is common if your application is serving requests over the network. Spark includes a fair scheduler to schedule resources within each SparkContext.
但是您可能已经注意到在这个 SO post Concurrent job Execution in Spark 中没有关于这个类似问题的公认答案并且最受支持的答案以
开头
This is not really in the spirit of Spark
- 大家都知道它不在 Spark "spirit"
- 谁在乎Spark的精神是什么?这实际上没有任何意义
有没有人以前做过这样的事情?你必须做任何特别的事情吗?在我浪费大量工作时间制作原型之前,只想得到一些建议。如果能提供任何帮助,我将不胜感激!
spark 上下文是线程安全的,因此可以从多个线程并行调用它。 (我正在制作中)
需要注意的一件事是限制线程的数量 运行,因为:
1. executor内存是所有线程共享的,可能会OOM或者不断的从缓存中换入换出内存
2. cpu 是有限的,所以任务多于核心不会有任何改善
您不需要在一个多线程应用程序中提交作业(尽管我确实认为您没有理由不这样做)。只需将您的作业作为单独的流程提交即可。有一个脚本可以一次提交所有这些作业并将进程推送到后台,或者以 yarn-cluster 模式提交。
您的调度程序(yarn、mesos、spark 集群)只会让您的一些作业等待,因为它没有空间让所有调度程序根据内存和/或 cpu 可用性同时 运行 .
请注意,如果您真正使用多个分区处理您的 table,我只会看到您的方法的好处 - 而不是像我多次看到的那样只有一个。另外,因为您需要处理那么多 table,我不确定您会受益多少(如果有的话)。根据您对 table 数据的处理方式,只有多个单线程和非火花作业 运行ning 可能更简单。
另见@cowbert 他的笔记。
同意@lev,我想了很久,所以我写了一个简单的小代码来确保它能正常工作,请注意!!为了控制每个驱动程序的工人数量,您需要使用 coalesce.
限制 dataframe/set
示例代码如下:
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession
object SparkMultiThreadExample extends App{
val TOTAL_WORKERS = 10
val NUMBER_OF_WORKERS_PER_DRIVER = 2
val sparkConf = new SparkConf()
sparkConf.setMaster(s"local[${TOTAL_WORKERS}]")
val spark = SparkSession.builder().config(sparkConf).getOrCreate()
val list1 = (0 until 10).toList
import spark.implicits._
list1.par.foreach(t => {
spark.createDataset(list1).coalesce(NUMBER_OF_WORKERS_PER_DRIVER).foreach(i => {
println(s"${Thread.currentThread()}, Driver thread ${t}: This is inside worker ${i} " )
Thread.sleep(1000)
println(s"FINISH ${Thread.currentThread()} Driver thread ${t}: This is inside worker ${i} " )
})
}) }
输出:
Thread[Executor task launch worker for task 0,5,main], Driver thread 0: This is inside worker 0
Thread[Executor task launch worker for task 4,5,main], Driver thread 3: This is inside worker 0
Thread[Executor task launch worker for task 7,5,main], Driver thread 5: This is inside worker 5
Thread[Executor task launch worker for task 1,5,main], Driver thread 0: This is inside worker 5
Thread[Executor task launch worker for task 3,5,main], Driver thread 2: This is inside worker 5
Thread[Executor task launch worker for task 6,5,main], Driver thread 5: This is inside worker 0
Thread[Executor task launch worker for task 2,5,main], Driver thread 2: This is inside worker 0
Thread[Executor task launch worker for task 5,5,main], Driver thread 3: This is inside worker 5
Thread[Executor task launch worker for task 9,5,main], Driver thread 4: This is inside worker 5
Thread[Executor task launch worker for task 8,5,main], Driver thread 4: This is inside worker 0
FINISH Thread[Executor task launch worker for task 0,5,main] Driver thread 0: This is inside worker 0
FINISH Thread[Executor task launch worker for task 7,5,main] Driver thread 5: This is inside worker 5
FINISH Thread[Executor task launch worker for task 4,5,main] Driver thread 3: This is inside worker 0
FINISH Thread[Executor task launch worker for task 3,5,main] Driver thread 2: This is inside worker 5
FINISH Thread[Executor task launch worker for task 1,5,main] Driver thread 0: This is inside worker 5
Thread[Executor task launch worker for task 3,5,main], Driver thread 2: This is inside worker 6
Thread[Executor task launch worker for task 4,5,main], Driver thread 3: This is inside worker 1
Thread[Executor task launch worker for task 1,5,main], Driver thread 0: This is inside worker 6
Thread[Executor task launch worker for task 0,5,main], Driver thread 0: This is inside worker 1
Thread[Executor task launch worker for task 7,5,main], Driver thread 5: This is inside worker 6
FINISH Thread[Executor task launch worker for task 2,5,main] Driver thread 2: This is inside worker 0
FINISH Thread[Executor task launch worker for task 5,5,main] Driver thread 3: This is inside worker 5
Thread[Executor task launch worker for task 2,5,main], Driver thread 2: This is inside worker 1
FINISH Thread[Executor task launch worker for task 9,5,main] Driver thread 4: This is inside worker 5
FINISH Thread[Executor task launch worker for task 6,5,main] Driver thread 5: This is inside worker 0
Thread[Executor task launch worker for task 9,5,main], Driver thread 4: This is inside worker 6
Thread[Executor task launch worker for task 5,5,main], Driver thread 3: This is inside worker 6
FINISH Thread[Executor task launch worker for task 8,5,main] Driver thread 4: This is inside worker 0
Thread[Executor task launch worker for task 6,5,main], Driver thread 5: This is inside worker 1
Thread[Executor task launch worker for task 8,5,main], Driver thread 4: This is inside worker 1
FINISH Thread[Executor task launch worker for task 3,5,main] Driver thread 2: This is inside worker 6
FINISH Thread[Executor task launch worker for task 4,5,main] Driver thread 3: This is inside worker 1
FINISH Thread[Executor task launch worker for task 1,5,main] Driver thread 0: This is inside worker 6
Thread[Executor task launch worker for task 4,5,main], Driver thread 3: This is inside worker 2
Thread[Executor task launch worker for task 3,5,main], Driver thread 2: This is inside worker 7
FINISH Thread[Executor task launch worker for task 0,5,main] Driver thread 0: This is inside worker 1
Thread[Executor task launch worker for task 1,5,main], Driver thread 0: This is inside worker 7
FINISH Thread[Executor task launch worker for task 7,5,main] Driver thread 5: This is inside worker 6
Thread[Executor task launch worker for task 0,5,main], Driver thread 0: This is inside worker 2
Thread[Executor task launch worker for task 7,5,main], Driver thread 5: This is inside worker 7
FINISH Thread[Executor task launch worker for task 2,5,main] Driver thread 2: This is inside worker 1
Thread[Executor task launch worker for task 2,5,main], Driver thread 2: This is inside worker 2
FINISH Thread[Executor task launch worker for task 9,5,main] Driver thread 4: This is inside worker 6
Thread[Executor task launch worker for task 9,5,main], Driver thread 4: This is inside worker 7
FINISH Thread[Executor task launch worker for task 5,5,main] Driver thread 3: This is inside worker 6
Thread[Executor task launch worker for task 5,5,main], Driver thread 3: This is inside worker 7
FINISH Thread[Executor task launch worker for task 6,5,main] Driver thread 5: This is inside worker 1
Thread[Executor task launch worker for task 6,5,main], Driver thread 5: This is inside worker 2
FINISH Thread[Executor task launch worker for task 8,5,main] Driver thread 4: This is inside worker 1
Thread[Executor task launch worker for task 8,5,main], Driver thread 4: This is inside worker 2
FINISH Thread[Executor task launch worker for task 4,5,main] Driver thread 3: This is inside worker 2
FINISH Thread[Executor task launch worker for task 7,5,main] Driver thread 5: This is inside worker 7
FINISH Thread[Executor task launch worker for task 0,5,main] Driver thread 0: This is inside worker 2
FINISH Thread[Executor task launch worker for task 1,5,main] Driver thread 0: This is inside worker 7
FINISH Thread[Executor task launch worker for task 3,5,main] Driver thread 2: This is inside worker 7
Thread[Executor task launch worker for task 7,5,main], Driver thread 5: This is inside worker 8
Thread[Executor task launch worker for task 4,5,main], Driver thread 3: This is inside worker 3
Thread[Executor task launch worker for task 3,5,main], Driver thread 2: This is inside worker 8
Thread[Executor task launch worker for task 0,5,main], Driver thread 0: This is inside worker 3
Thread[Executor task launch worker for task 1,5,main], Driver thread 0: This is inside worker 8
FINISH Thread[Executor task launch worker for task 2,5,main] Driver thread 2: This is inside worker 2
Thread[Executor task launch worker for task 2,5,main], Driver thread 2: This is inside worker 3
FINISH Thread[Executor task launch worker for task 9,5,main] Driver thread 4: This is inside worker 7
FINISH Thread[Executor task launch worker for task 5,5,main] Driver thread 3: This is inside worker 7
Thread[Executor task launch worker for task 9,5,main], Driver thread 4: This is inside worker 8
Thread[Executor task launch worker for task 5,5,main], Driver thread 3: This is inside worker 8
FINISH Thread[Executor task launch worker for task 6,5,main] Driver thread 5: This is inside worker 2
FINISH Thread[Executor task launch worker for task 8,5,main] Driver thread 4: This is inside worker 2
Thread[Executor task launch worker for task 6,5,main], Driver thread 5: This is inside worker 3
Thread[Executor task launch worker for task 8,5,main], Driver thread 4: This is inside worker 3
FINISH Thread[Executor task launch worker for task 7,5,main] Driver thread 5: This is inside worker 8
FINISH Thread[Executor task launch worker for task 4,5,main] Driver thread 3: This is inside worker 3
FINISH Thread[Executor task launch worker for task 0,5,main] Driver thread 0: This is inside worker 3
FINISH Thread[Executor task launch worker for task 3,5,main] Driver thread 2: This is inside worker 8
Thread[Executor task launch worker for task 0,5,main], Driver thread 0: This is inside worker 4
Thread[Executor task launch worker for task 3,5,main], Driver thread 2: This is inside worker 9
Thread[Executor task launch worker for task 4,5,main], Driver thread 3: This is inside worker 4
Thread[Executor task launch worker for task 7,5,main], Driver thread 5: This is inside worker 9
FINISH Thread[Executor task launch worker for task 1,5,main] Driver thread 0: This is inside worker 8
Thread[Executor task launch worker for task 1,5,main], Driver thread 0: This is inside worker 9
FINISH Thread[Executor task launch worker for task 2,5,main] Driver thread 2: This is inside worker 3
Thread[Executor task launch worker for task 2,5,main], Driver thread 2: This is inside worker 4
FINISH Thread[Executor task launch worker for task 9,5,main] Driver thread 4: This is inside worker 8
FINISH Thread[Executor task launch worker for task 5,5,main] Driver thread 3: This is inside worker 8
Thread[Executor task launch worker for task 9,5,main], Driver thread 4: This is inside worker 9
FINISH Thread[Executor task launch worker for task 6,5,main] Driver thread 5: This is inside worker 3
FINISH Thread[Executor task launch worker for task 8,5,main] Driver thread 4: This is inside worker 3
Thread[Executor task launch worker for task 5,5,main], Driver thread 3: This is inside worker 9
Thread[Executor task launch worker for task 8,5,main], Driver thread 4: This is inside worker 4
Thread[Executor task launch worker for task 6,5,main], Driver thread 5: This is inside worker 4
FINISH Thread[Executor task launch worker for task 0,5,main] Driver thread 0: This is inside worker 4
FINISH Thread[Executor task launch worker for task 4,5,main] Driver thread 3: This is inside worker 4
FINISH Thread[Executor task launch worker for task 3,5,main] Driver thread 2: This is inside worker 9
FINISH Thread[Executor task launch worker for task 7,5,main] Driver thread 5: This is inside worker 9
FINISH Thread[Executor task launch worker for task 1,5,main] Driver thread 0: This is inside worker 9
FINISH Thread[Executor task launch worker for task 2,5,main] Driver thread 2: This is inside worker 4
FINISH Thread[Executor task launch worker for task 9,5,main] Driver thread 4: This is inside worker 9
FINISH Thread[Executor task launch worker for task 5,5,main] Driver thread 3: This is inside worker 9
FINISH Thread[Executor task launch worker for task 6,5,main] Driver thread 5: This is inside worker 4
FINISH Thread[Executor task launch worker for task 8,5,main] Driver thread 4: This is inside worker 4
Thread[Executor task launch worker for task 11,5,main], Driver thread 7: This is inside worker 5
Thread[Executor task launch worker for task 10,5,main], Driver thread 7: This is inside worker 0
Thread[Executor task launch worker for task 12,5,main], Driver thread 6: This is inside worker 0
Thread[Executor task launch worker for task 13,5,main], Driver thread 6: This is inside worker 5
Thread[Executor task launch worker for task 14,5,main], Driver thread 1: This is inside worker 0
Thread[Executor task launch worker for task 15,5,main], Driver thread 1: This is inside worker 5
Thread[Executor task launch worker for task 16,5,main], Driver thread 8: This is inside worker 0
Thread[Executor task launch worker for task 17,5,main], Driver thread 8: This is inside worker 5
Thread[Executor task launch worker for task 19,5,main], Driver thread 9: This is inside worker 5
Thread[Executor task launch worker for task 18,5,main], Driver thread 9: This is inside worker 0
FINISH Thread[Executor task launch worker for task 11,5,main] Driver thread 7: This is inside worker 5
Thread[Executor task launch worker for task 11,5,main], Driver thread 7: This is inside worker 6
FINISH Thread[Executor task launch worker for task 10,5,main] Driver thread 7: This is inside worker 0
Thread[Executor task launch worker for task 10,5,main], Driver thread 7: This is inside worker 1
FINISH Thread[Executor task launch worker for task 12,5,main] Driver thread 6: This is inside worker 0
Thread[Executor task launch worker for task 12,5,main], Driver thread 6: This is inside worker 1
FINISH Thread[Executor task launch worker for task 13,5,main] Driver thread 6: This is inside worker 5
Thread[Executor task launch worker for task 13,5,main], Driver thread 6: This is inside worker 6
FINISH Thread[Executor task launch worker for task 14,5,main] Driver thread 1: This is inside worker 0
Thread[Executor task launch worker for task 14,5,main], Driver thread 1: This is inside worker 1
FINISH Thread[Executor task launch worker for task 15,5,main] Driver thread 1: This is inside worker 5
Thread[Executor task launch worker for task 15,5,main], Driver thread 1: This is inside worker 6
FINISH Thread[Executor task launch worker for task 16,5,main] Driver thread 8: This is inside worker 0
Thread[Executor task launch worker for task 16,5,main], Driver thread 8: This is inside worker 1
FINISH Thread[Executor task launch worker for task 17,5,main] Driver thread 8: This is inside worker 5
Thread[Executor task launch worker for task 17,5,main], Driver thread 8: This is inside worker 6
FINISH Thread[Executor task launch worker for task 19,5,main] Driver thread 9: This is inside worker 5
Thread[Executor task launch worker for task 19,5,main], Driver thread 9: This is inside worker 6
FINISH Thread[Executor task launch worker for task 18,5,main] Driver thread 9: This is inside worker 0
Thread[Executor task launch worker for task 18,5,main], Driver thread 9: This is inside worker 1
FINISH Thread[Executor task launch worker for task 11,5,main] Driver thread 7: This is inside worker 6
Thread[Executor task launch worker for task 11,5,main], Driver thread 7: This is inside worker 7
FINISH Thread[Executor task launch worker for task 10,5,main] Driver thread 7: This is inside worker 1
Thread[Executor task launch worker for task 10,5,main], Driver thread 7: This is inside worker 2
FINISH Thread[Executor task launch worker for task 12,5,main] Driver thread 6: This is inside worker 1
Thread[Executor task launch worker for task 12,5,main], Driver thread 6: This is inside worker 2
FINISH Thread[Executor task launch worker for task 13,5,main] Driver thread 6: This is inside worker 6
Thread[Executor task launch worker for task 13,5,main], Driver thread 6: This is inside worker 7
FINISH Thread[Executor task launch worker for task 14,5,main] Driver thread 1: This is inside worker 1
Thread[Executor task launch worker for task 14,5,main], Driver thread 1: This is inside worker 2
FINISH Thread[Executor task launch worker for task 15,5,main] Driver thread 1: This is inside worker 6
Thread[Executor task launch worker for task 15,5,main], Driver thread 1: This is inside worker 7
FINISH Thread[Executor task launch worker for task 16,5,main] Driver thread 8: This is inside worker 1
Thread[Executor task launch worker for task 16,5,main], Driver thread 8: This is inside worker 2
FINISH Thread[Executor task launch worker for task 17,5,main] Driver thread 8: This is inside worker 6
Thread[Executor task launch worker for task 17,5,main], Driver thread 8: This is inside worker 7
FINISH Thread[Executor task launch worker for task 19,5,main] Driver thread 9: This is inside worker 6
Thread[Executor task launch worker for task 19,5,main], Driver thread 9: This is inside worker 7
FINISH Thread[Executor task launch worker for task 18,5,main] Driver thread 9: This is inside worker 1
Thread[Executor task launch worker for task 18,5,main], Driver thread 9: This is inside worker 2
FINISH Thread[Executor task launch worker for task 11,5,main] Driver thread 7: This is inside worker 7
Thread[Executor task launch worker for task 11,5,main], Driver thread 7: This is inside worker 8
FINISH Thread[Executor task launch worker for task 10,5,main] Driver thread 7: This is inside worker 2
Thread[Executor task launch worker for task 10,5,main], Driver thread 7: This is inside worker 3
FINISH Thread[Executor task launch worker for task 12,5,main] Driver thread 6: This is inside worker 2
Thread[Executor task launch worker for task 12,5,main], Driver thread 6: This is inside worker 3
FINISH Thread[Executor task launch worker for task 13,5,main] Driver thread 6: This is inside worker 7
Thread[Executor task launch worker for task 13,5,main], Driver thread 6: This is inside worker 8
FINISH Thread[Executor task launch worker for task 14,5,main] Driver thread 1: This is inside worker 2
Thread[Executor task launch worker for task 14,5,main], Driver thread 1: This is inside worker 3
FINISH Thread[Executor task launch worker for task 15,5,main] Driver thread 1: This is inside worker 7
Thread[Executor task launch worker for task 15,5,main], Driver thread 1: This is inside worker 8
FINISH Thread[Executor task launch worker for task 16,5,main] Driver thread 8: This is inside worker 2
Thread[Executor task launch worker for task 16,5,main], Driver thread 8: This is inside worker 3
FINISH Thread[Executor task launch worker for task 17,5,main] Driver thread 8: This is inside worker 7
Thread[Executor task launch worker for task 17,5,main], Driver thread 8: This is inside worker 8
FINISH Thread[Executor task launch worker for task 19,5,main] Driver thread 9: This is inside worker 7
Thread[Executor task launch worker for task 19,5,main], Driver thread 9: This is inside worker 8
FINISH Thread[Executor task launch worker for task 18,5,main] Driver thread 9: This is inside worker 2
Thread[Executor task launch worker for task 18,5,main], Driver thread 9: This is inside worker 3
FINISH Thread[Executor task launch worker for task 11,5,main] Driver thread 7: This is inside worker 8
Thread[Executor task launch worker for task 11,5,main], Driver thread 7: This is inside worker 9
FINISH Thread[Executor task launch worker for task 10,5,main] Driver thread 7: This is inside worker 3
Thread[Executor task launch worker for task 10,5,main], Driver thread 7: This is inside worker 4
FINISH Thread[Executor task launch worker for task 12,5,main] Driver thread 6: This is inside worker 3
Thread[Executor task launch worker for task 12,5,main], Driver thread 6: This is inside worker 4
FINISH Thread[Executor task launch worker for task 13,5,main] Driver thread 6: This is inside worker 8
Thread[Executor task launch worker for task 13,5,main], Driver thread 6: This is inside worker 9
FINISH Thread[Executor task launch worker for task 14,5,main] Driver thread 1: This is inside worker 3
Thread[Executor task launch worker for task 14,5,main], Driver thread 1: This is inside worker 4
FINISH Thread[Executor task launch worker for task 15,5,main] Driver thread 1: This is inside worker 8
Thread[Executor task launch worker for task 15,5,main], Driver thread 1: This is inside worker 9
FINISH Thread[Executor task launch worker for task 16,5,main] Driver thread 8: This is inside worker 3
Thread[Executor task launch worker for task 16,5,main], Driver thread 8: This is inside worker 4
FINISH Thread[Executor task launch worker for task 17,5,main] Driver thread 8: This is inside worker 8
Thread[Executor task launch worker for task 17,5,main], Driver thread 8: This is inside worker 9
FINISH Thread[Executor task launch worker for task 19,5,main] Driver thread 9: This is inside worker 8
Thread[Executor task launch worker for task 19,5,main], Driver thread 9: This is inside worker 9
FINISH Thread[Executor task launch worker for task 18,5,main] Driver thread 9: This is inside worker 3
Thread[Executor task launch worker for task 18,5,main], Driver thread 9: This is inside worker 4
FINISH Thread[Executor task launch worker for task 11,5,main] Driver thread 7: This is inside worker 9
FINISH Thread[Executor task launch worker for task 10,5,main] Driver thread 7: This is inside worker 4
FINISH Thread[Executor task launch worker for task 12,5,main] Driver thread 6: This is inside worker 4
FINISH Thread[Executor task launch worker for task 13,5,main] Driver thread 6: This is inside worker 9
FINISH Thread[Executor task launch worker for task 14,5,main] Driver thread 1: This is inside worker 4
FINISH Thread[Executor task launch worker for task 15,5,main] Driver thread 1: This is inside worker 9
FINISH Thread[Executor task launch worker for task 16,5,main] Driver thread 8: This is inside worker 4
FINISH Thread[Executor task launch worker for task 17,5,main] Driver thread 8: This is inside worker 9
FINISH Thread[Executor task launch worker for task 19,5,main] Driver thread 9: This is inside worker 9
FINISH Thread[Executor task launch worker for task 18,5,main] Driver thread 9: This is inside worker 4
我想使用 Spark 从大约 1500 个远程 Oracle table 中提取数据,我想要一个多线程应用程序,每个线程获取一个 table 或者 10 tables 每个线程并启动一个 spark 作业来读取它们各自的 tables。
来自官方 spark 站点 https://spark.apache.org/docs/latest/job-scheduling.html 很明显这可以工作...
...cluster managers that Spark runs on provide facilities for scheduling across applications. Second, within each Spark application, multiple “jobs” (Spark actions) may be running concurrently if they were submitted by different threads. This is common if your application is serving requests over the network. Spark includes a fair scheduler to schedule resources within each SparkContext.
但是您可能已经注意到在这个 SO post Concurrent job Execution in Spark 中没有关于这个类似问题的公认答案并且最受支持的答案以
开头This is not really in the spirit of Spark
- 大家都知道它不在 Spark "spirit"
- 谁在乎Spark的精神是什么?这实际上没有任何意义
有没有人以前做过这样的事情?你必须做任何特别的事情吗?在我浪费大量工作时间制作原型之前,只想得到一些建议。如果能提供任何帮助,我将不胜感激!
spark 上下文是线程安全的,因此可以从多个线程并行调用它。 (我正在制作中)
需要注意的一件事是限制线程的数量 运行,因为:
1. executor内存是所有线程共享的,可能会OOM或者不断的从缓存中换入换出内存
2. cpu 是有限的,所以任务多于核心不会有任何改善
您不需要在一个多线程应用程序中提交作业(尽管我确实认为您没有理由不这样做)。只需将您的作业作为单独的流程提交即可。有一个脚本可以一次提交所有这些作业并将进程推送到后台,或者以 yarn-cluster 模式提交。 您的调度程序(yarn、mesos、spark 集群)只会让您的一些作业等待,因为它没有空间让所有调度程序根据内存和/或 cpu 可用性同时 运行 .
请注意,如果您真正使用多个分区处理您的 table,我只会看到您的方法的好处 - 而不是像我多次看到的那样只有一个。另外,因为您需要处理那么多 table,我不确定您会受益多少(如果有的话)。根据您对 table 数据的处理方式,只有多个单线程和非火花作业 运行ning 可能更简单。
另见@cowbert 他的笔记。
同意@lev,我想了很久,所以我写了一个简单的小代码来确保它能正常工作,请注意!!为了控制每个驱动程序的工人数量,您需要使用 coalesce.
限制 dataframe/set示例代码如下:
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession
object SparkMultiThreadExample extends App{
val TOTAL_WORKERS = 10
val NUMBER_OF_WORKERS_PER_DRIVER = 2
val sparkConf = new SparkConf()
sparkConf.setMaster(s"local[${TOTAL_WORKERS}]")
val spark = SparkSession.builder().config(sparkConf).getOrCreate()
val list1 = (0 until 10).toList
import spark.implicits._
list1.par.foreach(t => {
spark.createDataset(list1).coalesce(NUMBER_OF_WORKERS_PER_DRIVER).foreach(i => {
println(s"${Thread.currentThread()}, Driver thread ${t}: This is inside worker ${i} " )
Thread.sleep(1000)
println(s"FINISH ${Thread.currentThread()} Driver thread ${t}: This is inside worker ${i} " )
})
}) }
输出:
Thread[Executor task launch worker for task 0,5,main], Driver thread 0: This is inside worker 0
Thread[Executor task launch worker for task 4,5,main], Driver thread 3: This is inside worker 0
Thread[Executor task launch worker for task 7,5,main], Driver thread 5: This is inside worker 5
Thread[Executor task launch worker for task 1,5,main], Driver thread 0: This is inside worker 5
Thread[Executor task launch worker for task 3,5,main], Driver thread 2: This is inside worker 5
Thread[Executor task launch worker for task 6,5,main], Driver thread 5: This is inside worker 0
Thread[Executor task launch worker for task 2,5,main], Driver thread 2: This is inside worker 0
Thread[Executor task launch worker for task 5,5,main], Driver thread 3: This is inside worker 5
Thread[Executor task launch worker for task 9,5,main], Driver thread 4: This is inside worker 5
Thread[Executor task launch worker for task 8,5,main], Driver thread 4: This is inside worker 0
FINISH Thread[Executor task launch worker for task 0,5,main] Driver thread 0: This is inside worker 0
FINISH Thread[Executor task launch worker for task 7,5,main] Driver thread 5: This is inside worker 5
FINISH Thread[Executor task launch worker for task 4,5,main] Driver thread 3: This is inside worker 0
FINISH Thread[Executor task launch worker for task 3,5,main] Driver thread 2: This is inside worker 5
FINISH Thread[Executor task launch worker for task 1,5,main] Driver thread 0: This is inside worker 5
Thread[Executor task launch worker for task 3,5,main], Driver thread 2: This is inside worker 6
Thread[Executor task launch worker for task 4,5,main], Driver thread 3: This is inside worker 1
Thread[Executor task launch worker for task 1,5,main], Driver thread 0: This is inside worker 6
Thread[Executor task launch worker for task 0,5,main], Driver thread 0: This is inside worker 1
Thread[Executor task launch worker for task 7,5,main], Driver thread 5: This is inside worker 6
FINISH Thread[Executor task launch worker for task 2,5,main] Driver thread 2: This is inside worker 0
FINISH Thread[Executor task launch worker for task 5,5,main] Driver thread 3: This is inside worker 5
Thread[Executor task launch worker for task 2,5,main], Driver thread 2: This is inside worker 1
FINISH Thread[Executor task launch worker for task 9,5,main] Driver thread 4: This is inside worker 5
FINISH Thread[Executor task launch worker for task 6,5,main] Driver thread 5: This is inside worker 0
Thread[Executor task launch worker for task 9,5,main], Driver thread 4: This is inside worker 6
Thread[Executor task launch worker for task 5,5,main], Driver thread 3: This is inside worker 6
FINISH Thread[Executor task launch worker for task 8,5,main] Driver thread 4: This is inside worker 0
Thread[Executor task launch worker for task 6,5,main], Driver thread 5: This is inside worker 1
Thread[Executor task launch worker for task 8,5,main], Driver thread 4: This is inside worker 1
FINISH Thread[Executor task launch worker for task 3,5,main] Driver thread 2: This is inside worker 6
FINISH Thread[Executor task launch worker for task 4,5,main] Driver thread 3: This is inside worker 1
FINISH Thread[Executor task launch worker for task 1,5,main] Driver thread 0: This is inside worker 6
Thread[Executor task launch worker for task 4,5,main], Driver thread 3: This is inside worker 2
Thread[Executor task launch worker for task 3,5,main], Driver thread 2: This is inside worker 7
FINISH Thread[Executor task launch worker for task 0,5,main] Driver thread 0: This is inside worker 1
Thread[Executor task launch worker for task 1,5,main], Driver thread 0: This is inside worker 7
FINISH Thread[Executor task launch worker for task 7,5,main] Driver thread 5: This is inside worker 6
Thread[Executor task launch worker for task 0,5,main], Driver thread 0: This is inside worker 2
Thread[Executor task launch worker for task 7,5,main], Driver thread 5: This is inside worker 7
FINISH Thread[Executor task launch worker for task 2,5,main] Driver thread 2: This is inside worker 1
Thread[Executor task launch worker for task 2,5,main], Driver thread 2: This is inside worker 2
FINISH Thread[Executor task launch worker for task 9,5,main] Driver thread 4: This is inside worker 6
Thread[Executor task launch worker for task 9,5,main], Driver thread 4: This is inside worker 7
FINISH Thread[Executor task launch worker for task 5,5,main] Driver thread 3: This is inside worker 6
Thread[Executor task launch worker for task 5,5,main], Driver thread 3: This is inside worker 7
FINISH Thread[Executor task launch worker for task 6,5,main] Driver thread 5: This is inside worker 1
Thread[Executor task launch worker for task 6,5,main], Driver thread 5: This is inside worker 2
FINISH Thread[Executor task launch worker for task 8,5,main] Driver thread 4: This is inside worker 1
Thread[Executor task launch worker for task 8,5,main], Driver thread 4: This is inside worker 2
FINISH Thread[Executor task launch worker for task 4,5,main] Driver thread 3: This is inside worker 2
FINISH Thread[Executor task launch worker for task 7,5,main] Driver thread 5: This is inside worker 7
FINISH Thread[Executor task launch worker for task 0,5,main] Driver thread 0: This is inside worker 2
FINISH Thread[Executor task launch worker for task 1,5,main] Driver thread 0: This is inside worker 7
FINISH Thread[Executor task launch worker for task 3,5,main] Driver thread 2: This is inside worker 7
Thread[Executor task launch worker for task 7,5,main], Driver thread 5: This is inside worker 8
Thread[Executor task launch worker for task 4,5,main], Driver thread 3: This is inside worker 3
Thread[Executor task launch worker for task 3,5,main], Driver thread 2: This is inside worker 8
Thread[Executor task launch worker for task 0,5,main], Driver thread 0: This is inside worker 3
Thread[Executor task launch worker for task 1,5,main], Driver thread 0: This is inside worker 8
FINISH Thread[Executor task launch worker for task 2,5,main] Driver thread 2: This is inside worker 2
Thread[Executor task launch worker for task 2,5,main], Driver thread 2: This is inside worker 3
FINISH Thread[Executor task launch worker for task 9,5,main] Driver thread 4: This is inside worker 7
FINISH Thread[Executor task launch worker for task 5,5,main] Driver thread 3: This is inside worker 7
Thread[Executor task launch worker for task 9,5,main], Driver thread 4: This is inside worker 8
Thread[Executor task launch worker for task 5,5,main], Driver thread 3: This is inside worker 8
FINISH Thread[Executor task launch worker for task 6,5,main] Driver thread 5: This is inside worker 2
FINISH Thread[Executor task launch worker for task 8,5,main] Driver thread 4: This is inside worker 2
Thread[Executor task launch worker for task 6,5,main], Driver thread 5: This is inside worker 3
Thread[Executor task launch worker for task 8,5,main], Driver thread 4: This is inside worker 3
FINISH Thread[Executor task launch worker for task 7,5,main] Driver thread 5: This is inside worker 8
FINISH Thread[Executor task launch worker for task 4,5,main] Driver thread 3: This is inside worker 3
FINISH Thread[Executor task launch worker for task 0,5,main] Driver thread 0: This is inside worker 3
FINISH Thread[Executor task launch worker for task 3,5,main] Driver thread 2: This is inside worker 8
Thread[Executor task launch worker for task 0,5,main], Driver thread 0: This is inside worker 4
Thread[Executor task launch worker for task 3,5,main], Driver thread 2: This is inside worker 9
Thread[Executor task launch worker for task 4,5,main], Driver thread 3: This is inside worker 4
Thread[Executor task launch worker for task 7,5,main], Driver thread 5: This is inside worker 9
FINISH Thread[Executor task launch worker for task 1,5,main] Driver thread 0: This is inside worker 8
Thread[Executor task launch worker for task 1,5,main], Driver thread 0: This is inside worker 9
FINISH Thread[Executor task launch worker for task 2,5,main] Driver thread 2: This is inside worker 3
Thread[Executor task launch worker for task 2,5,main], Driver thread 2: This is inside worker 4
FINISH Thread[Executor task launch worker for task 9,5,main] Driver thread 4: This is inside worker 8
FINISH Thread[Executor task launch worker for task 5,5,main] Driver thread 3: This is inside worker 8
Thread[Executor task launch worker for task 9,5,main], Driver thread 4: This is inside worker 9
FINISH Thread[Executor task launch worker for task 6,5,main] Driver thread 5: This is inside worker 3
FINISH Thread[Executor task launch worker for task 8,5,main] Driver thread 4: This is inside worker 3
Thread[Executor task launch worker for task 5,5,main], Driver thread 3: This is inside worker 9
Thread[Executor task launch worker for task 8,5,main], Driver thread 4: This is inside worker 4
Thread[Executor task launch worker for task 6,5,main], Driver thread 5: This is inside worker 4
FINISH Thread[Executor task launch worker for task 0,5,main] Driver thread 0: This is inside worker 4
FINISH Thread[Executor task launch worker for task 4,5,main] Driver thread 3: This is inside worker 4
FINISH Thread[Executor task launch worker for task 3,5,main] Driver thread 2: This is inside worker 9
FINISH Thread[Executor task launch worker for task 7,5,main] Driver thread 5: This is inside worker 9
FINISH Thread[Executor task launch worker for task 1,5,main] Driver thread 0: This is inside worker 9
FINISH Thread[Executor task launch worker for task 2,5,main] Driver thread 2: This is inside worker 4
FINISH Thread[Executor task launch worker for task 9,5,main] Driver thread 4: This is inside worker 9
FINISH Thread[Executor task launch worker for task 5,5,main] Driver thread 3: This is inside worker 9
FINISH Thread[Executor task launch worker for task 6,5,main] Driver thread 5: This is inside worker 4
FINISH Thread[Executor task launch worker for task 8,5,main] Driver thread 4: This is inside worker 4
Thread[Executor task launch worker for task 11,5,main], Driver thread 7: This is inside worker 5
Thread[Executor task launch worker for task 10,5,main], Driver thread 7: This is inside worker 0
Thread[Executor task launch worker for task 12,5,main], Driver thread 6: This is inside worker 0
Thread[Executor task launch worker for task 13,5,main], Driver thread 6: This is inside worker 5
Thread[Executor task launch worker for task 14,5,main], Driver thread 1: This is inside worker 0
Thread[Executor task launch worker for task 15,5,main], Driver thread 1: This is inside worker 5
Thread[Executor task launch worker for task 16,5,main], Driver thread 8: This is inside worker 0
Thread[Executor task launch worker for task 17,5,main], Driver thread 8: This is inside worker 5
Thread[Executor task launch worker for task 19,5,main], Driver thread 9: This is inside worker 5
Thread[Executor task launch worker for task 18,5,main], Driver thread 9: This is inside worker 0
FINISH Thread[Executor task launch worker for task 11,5,main] Driver thread 7: This is inside worker 5
Thread[Executor task launch worker for task 11,5,main], Driver thread 7: This is inside worker 6
FINISH Thread[Executor task launch worker for task 10,5,main] Driver thread 7: This is inside worker 0
Thread[Executor task launch worker for task 10,5,main], Driver thread 7: This is inside worker 1
FINISH Thread[Executor task launch worker for task 12,5,main] Driver thread 6: This is inside worker 0
Thread[Executor task launch worker for task 12,5,main], Driver thread 6: This is inside worker 1
FINISH Thread[Executor task launch worker for task 13,5,main] Driver thread 6: This is inside worker 5
Thread[Executor task launch worker for task 13,5,main], Driver thread 6: This is inside worker 6
FINISH Thread[Executor task launch worker for task 14,5,main] Driver thread 1: This is inside worker 0
Thread[Executor task launch worker for task 14,5,main], Driver thread 1: This is inside worker 1
FINISH Thread[Executor task launch worker for task 15,5,main] Driver thread 1: This is inside worker 5
Thread[Executor task launch worker for task 15,5,main], Driver thread 1: This is inside worker 6
FINISH Thread[Executor task launch worker for task 16,5,main] Driver thread 8: This is inside worker 0
Thread[Executor task launch worker for task 16,5,main], Driver thread 8: This is inside worker 1
FINISH Thread[Executor task launch worker for task 17,5,main] Driver thread 8: This is inside worker 5
Thread[Executor task launch worker for task 17,5,main], Driver thread 8: This is inside worker 6
FINISH Thread[Executor task launch worker for task 19,5,main] Driver thread 9: This is inside worker 5
Thread[Executor task launch worker for task 19,5,main], Driver thread 9: This is inside worker 6
FINISH Thread[Executor task launch worker for task 18,5,main] Driver thread 9: This is inside worker 0
Thread[Executor task launch worker for task 18,5,main], Driver thread 9: This is inside worker 1
FINISH Thread[Executor task launch worker for task 11,5,main] Driver thread 7: This is inside worker 6
Thread[Executor task launch worker for task 11,5,main], Driver thread 7: This is inside worker 7
FINISH Thread[Executor task launch worker for task 10,5,main] Driver thread 7: This is inside worker 1
Thread[Executor task launch worker for task 10,5,main], Driver thread 7: This is inside worker 2
FINISH Thread[Executor task launch worker for task 12,5,main] Driver thread 6: This is inside worker 1
Thread[Executor task launch worker for task 12,5,main], Driver thread 6: This is inside worker 2
FINISH Thread[Executor task launch worker for task 13,5,main] Driver thread 6: This is inside worker 6
Thread[Executor task launch worker for task 13,5,main], Driver thread 6: This is inside worker 7
FINISH Thread[Executor task launch worker for task 14,5,main] Driver thread 1: This is inside worker 1
Thread[Executor task launch worker for task 14,5,main], Driver thread 1: This is inside worker 2
FINISH Thread[Executor task launch worker for task 15,5,main] Driver thread 1: This is inside worker 6
Thread[Executor task launch worker for task 15,5,main], Driver thread 1: This is inside worker 7
FINISH Thread[Executor task launch worker for task 16,5,main] Driver thread 8: This is inside worker 1
Thread[Executor task launch worker for task 16,5,main], Driver thread 8: This is inside worker 2
FINISH Thread[Executor task launch worker for task 17,5,main] Driver thread 8: This is inside worker 6
Thread[Executor task launch worker for task 17,5,main], Driver thread 8: This is inside worker 7
FINISH Thread[Executor task launch worker for task 19,5,main] Driver thread 9: This is inside worker 6
Thread[Executor task launch worker for task 19,5,main], Driver thread 9: This is inside worker 7
FINISH Thread[Executor task launch worker for task 18,5,main] Driver thread 9: This is inside worker 1
Thread[Executor task launch worker for task 18,5,main], Driver thread 9: This is inside worker 2
FINISH Thread[Executor task launch worker for task 11,5,main] Driver thread 7: This is inside worker 7
Thread[Executor task launch worker for task 11,5,main], Driver thread 7: This is inside worker 8
FINISH Thread[Executor task launch worker for task 10,5,main] Driver thread 7: This is inside worker 2
Thread[Executor task launch worker for task 10,5,main], Driver thread 7: This is inside worker 3
FINISH Thread[Executor task launch worker for task 12,5,main] Driver thread 6: This is inside worker 2
Thread[Executor task launch worker for task 12,5,main], Driver thread 6: This is inside worker 3
FINISH Thread[Executor task launch worker for task 13,5,main] Driver thread 6: This is inside worker 7
Thread[Executor task launch worker for task 13,5,main], Driver thread 6: This is inside worker 8
FINISH Thread[Executor task launch worker for task 14,5,main] Driver thread 1: This is inside worker 2
Thread[Executor task launch worker for task 14,5,main], Driver thread 1: This is inside worker 3
FINISH Thread[Executor task launch worker for task 15,5,main] Driver thread 1: This is inside worker 7
Thread[Executor task launch worker for task 15,5,main], Driver thread 1: This is inside worker 8
FINISH Thread[Executor task launch worker for task 16,5,main] Driver thread 8: This is inside worker 2
Thread[Executor task launch worker for task 16,5,main], Driver thread 8: This is inside worker 3
FINISH Thread[Executor task launch worker for task 17,5,main] Driver thread 8: This is inside worker 7
Thread[Executor task launch worker for task 17,5,main], Driver thread 8: This is inside worker 8
FINISH Thread[Executor task launch worker for task 19,5,main] Driver thread 9: This is inside worker 7
Thread[Executor task launch worker for task 19,5,main], Driver thread 9: This is inside worker 8
FINISH Thread[Executor task launch worker for task 18,5,main] Driver thread 9: This is inside worker 2
Thread[Executor task launch worker for task 18,5,main], Driver thread 9: This is inside worker 3
FINISH Thread[Executor task launch worker for task 11,5,main] Driver thread 7: This is inside worker 8
Thread[Executor task launch worker for task 11,5,main], Driver thread 7: This is inside worker 9
FINISH Thread[Executor task launch worker for task 10,5,main] Driver thread 7: This is inside worker 3
Thread[Executor task launch worker for task 10,5,main], Driver thread 7: This is inside worker 4
FINISH Thread[Executor task launch worker for task 12,5,main] Driver thread 6: This is inside worker 3
Thread[Executor task launch worker for task 12,5,main], Driver thread 6: This is inside worker 4
FINISH Thread[Executor task launch worker for task 13,5,main] Driver thread 6: This is inside worker 8
Thread[Executor task launch worker for task 13,5,main], Driver thread 6: This is inside worker 9
FINISH Thread[Executor task launch worker for task 14,5,main] Driver thread 1: This is inside worker 3
Thread[Executor task launch worker for task 14,5,main], Driver thread 1: This is inside worker 4
FINISH Thread[Executor task launch worker for task 15,5,main] Driver thread 1: This is inside worker 8
Thread[Executor task launch worker for task 15,5,main], Driver thread 1: This is inside worker 9
FINISH Thread[Executor task launch worker for task 16,5,main] Driver thread 8: This is inside worker 3
Thread[Executor task launch worker for task 16,5,main], Driver thread 8: This is inside worker 4
FINISH Thread[Executor task launch worker for task 17,5,main] Driver thread 8: This is inside worker 8
Thread[Executor task launch worker for task 17,5,main], Driver thread 8: This is inside worker 9
FINISH Thread[Executor task launch worker for task 19,5,main] Driver thread 9: This is inside worker 8
Thread[Executor task launch worker for task 19,5,main], Driver thread 9: This is inside worker 9
FINISH Thread[Executor task launch worker for task 18,5,main] Driver thread 9: This is inside worker 3
Thread[Executor task launch worker for task 18,5,main], Driver thread 9: This is inside worker 4
FINISH Thread[Executor task launch worker for task 11,5,main] Driver thread 7: This is inside worker 9
FINISH Thread[Executor task launch worker for task 10,5,main] Driver thread 7: This is inside worker 4
FINISH Thread[Executor task launch worker for task 12,5,main] Driver thread 6: This is inside worker 4
FINISH Thread[Executor task launch worker for task 13,5,main] Driver thread 6: This is inside worker 9
FINISH Thread[Executor task launch worker for task 14,5,main] Driver thread 1: This is inside worker 4
FINISH Thread[Executor task launch worker for task 15,5,main] Driver thread 1: This is inside worker 9
FINISH Thread[Executor task launch worker for task 16,5,main] Driver thread 8: This is inside worker 4
FINISH Thread[Executor task launch worker for task 17,5,main] Driver thread 8: This is inside worker 9
FINISH Thread[Executor task launch worker for task 19,5,main] Driver thread 9: This is inside worker 9
FINISH Thread[Executor task launch worker for task 18,5,main] Driver thread 9: This is inside worker 4