Multiple Spark applications at the same time, same Jarfile ... jobs are in waiting status

Question

Spark/Scala newbie here.

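I am running Spark in a cluster environment. I have two very similar apps (each with its own Spark config and context). When I try to kick them both off, the first seems to grab all the resources and the second waits for resources to free up. I set the resources at submit time, but it does not seem to matter. Each node has 24 cores and 45 GB of memory available. Here are the two commands I use to submit the apps that I want to run in parallel.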

./bin/spark-submit --master spark://MASTER:6066 --class MainAggregator --conf spark.driver.memory=10g --conf spark.executor.memory=10g --executor-cores 3 --num-executors 5 sparkapp_2.11-0.1.jar -new

./bin/spark-submit --master spark://MASTER:6066 --class BackAggregator --conf spark.driver.memory=5g --conf spark.executor.memory=5g --executor-cores 3 --num-executors 5 sparkapp_2.11-0.1.jar 01/22/2020 01/23/2020

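I should also note that the second app does kick off, but in the master monitoring web page I see it as "Waiting", and it has 0 cores until the first one is done. The apps do pull from the same tables, but the chunks of data they pull are very different, so the RDDs/DataFrames are unique, if that makes a difference.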

What am I missing in order to run these at the same time?

Answer 1

second App does kick off but in the master monitoring webpage I see it as "Waiting" and it will have 0 cores until the first is done.

I ran into the same thing some time ago. It is probably one of these two causes.

1) You do not have adequate infrastructure.

2) You may be using the capacity scheduler, which does not preempt running jobs to make room for new ones.

If it is #1, then you have to add more nodes and allocate more resources with your spark-submit.
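As a rough illustration of allocating resources explicitly, the sketch below sets the per-application limits in the driver instead of on the command line. It assumes the standalone cluster manager implied by the `spark://` master URL in the question, where an application that does not set `spark.cores.max` can claim every available core by default. The object name `CappedAggregator` is hypothetical, and the value 15 (5 executors x 3 cores, taken from the submit commands) is an assumption, not a tested setting.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Hypothetical driver; the master URL is expected to come from spark-submit.
object CappedAggregator {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("MainAggregator")
      .set("spark.executor.memory", "10g") // same value as in the question
      .set("spark.executor.cores", "3")    // cores per executor
      .set("spark.cores.max", "15")        // assumed cap on total cores for this app

    val spark = SparkSession.builder().config(conf).getOrCreate()
    try {
      // Placeholder for the real aggregation logic.
      println(s"rows: ${spark.range(1000).count()}")
    } finally {
      spark.stop()
    }
  }
}
```

With a cap like this on each application, whatever cores the first one leaves unclaimed remain available to the second, provided the cluster actually has them.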

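If it is #2, then you can adopt Hadoop fair scheduling, where you can maintain 2 pools (see the Spark documentation on this). The advantage is that you can run parallel jobs: the fair scheduler takes care of it by preempting some resources and allocating them to the other job that is running concurrently.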

mainpool for the first Spark job..
backlogpool to run your second Spark job.

To achieve this you need an XML pool configuration like this. Sample pool config:

<?xml version="1.0"?>
<allocations>
    <pool name="default">
        <schedulingMode>FAIR</schedulingMode>
        <weight>3</weight>
        <minShare>3</minShare>
    </pool>
    <pool name="mainpool">
        <schedulingMode>FAIR</schedulingMode>
        <weight>3</weight>
        <minShare>3</minShare>
    </pool>
    <pool name="backlogpool">
        <schedulingMode>FAIR</schedulingMode>
        <weight>3</weight>
        <minShare>3</minShare>
    </pool>
</allocations>

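Along with that, you need to make a few small changes in the driver code, such as which pool the first job should go to and which pool the second job should go to.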
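A minimal sketch of what those driver changes might look like, using the pool names from the sample configuration. It assumes both aggregations are submitted from a single SparkContext, because Spark's scheduler pools divide resources among jobs inside one application. The object name, the helper `inPool`, the two placeholder job methods, and the path to the XML file are all hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object PooledAggregators {
  // Hypothetical stand-ins for the real aggregation logic.
  def runMainAggregation(spark: SparkSession): Unit =
    println(s"main rows: ${spark.range(1000).count()}")

  def runBacklogAggregation(spark: SparkSession): Unit =
    println(s"backlog rows: ${spark.range(1000).count()}")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("PooledAggregators")
      .config("spark.scheduler.mode", "FAIR")
      // Assumed location of the pool configuration file.
      .config("spark.scheduler.allocation.file", "/path/to/fairscheduler.xml")
      .getOrCreate()

    // The pool is a thread-local property, so each job runs in its own thread.
    def inPool(pool: String)(job: => Unit): Thread = {
      val t = new Thread(new Runnable {
        override def run(): Unit = {
          spark.sparkContext.setLocalProperty("spark.scheduler.pool", pool)
          job
        }
      })
      t.start()
      t
    }

    val threads = Seq(
      inPool("mainpool")(runMainAggregation(spark)),
      inPool("backlogpool")(runBacklogAggregation(spark))
    )
    threads.foreach(_.join()) // wait for both jobs before shutting down
    spark.stop()
  }
}
```

Jobs submitted from a thread that never sets `spark.scheduler.pool` go to the pool named `default`.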

How it works:

For more details, see my articles..

hadoop-yarn-fair-schedular-advantages-explained-part1

hadoop-yarn-fair-schedular-advantages-explained-part2

Try these ideas to overcome the waiting. Hope this helps..

hadoop

hadoop-yarn

apache-spark