Apache Spark: setting executor instances does not change the executors

Question

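I have an Apache Spark application running on a YARN cluster in cluster mode (Spark has 3 nodes on this cluster).

While the application runs, the Spark UI shows 2 executors (each on a different node) and the driver on the third node. I want the application to use more executors, so I added the --num-executors parameter to spark-submit and set it to 6: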

spark-submit --driver-memory 3G --num-executors 6 --class main.Application --executor-memory 11G --master yarn-cluster myJar.jar <arg1> <arg2> <arg3> ...

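However, the number of executors is still 2.

In the Spark UI I can see that spark.executor.instances is 6, as I expected, but somehow there are still only 2 executors.

I even tried setting this parameter in code: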

sparkConf.set("spark.executor.instances", "6")

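Again, I can see the parameter set to 6, but there are still only 2 executors.

Does anyone know why I cannot increase the number of executors?

yarn.nodemanager.resource.memory-mb is 12g in yarn-site.xml.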

Answer 1

Increase yarn.nodemanager.resource.memory-mb in yarn-site.xml.

With 12g per node, you can only launch the driver (3g) and 2 executors (11g each).

Node1 - driver 3g (+7% overhead)

Node2 - executor1 11g (+7% overhead)

Node3 - executor2 11g (+7% overhead)

You are now requesting an 11g executor3, but no node has 11g of free memory.

For the 7% overhead, see spark.yarn.executor.memoryOverhead and spark.yarn.driver.memoryOverhead in https://spark.apache.org/docs/1.2.0/running-on-yarn.html
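The arithmetic behind this answer can be checked with a short sketch. It assumes that "12g" means 12 * 1024 MB and that the overhead is a flat 7% of the heap, as quoted above; it ignores any minimum overhead and any rounding YARN applies to container sizes. The function and variable names are made up for illustration.

```python
# Assumption: "12g" for yarn.nodemanager.resource.memory-mb means 12 * 1024 MB.
NODE_MB = 12 * 1024
# The 7% overhead figure quoted in this answer.
OVERHEAD_FRACTION = 0.07


def container_mb(heap_gb, overhead_fraction=OVERHEAD_FRACTION):
    """Heap plus a flat fractional overhead, in MB (simplified model)."""
    heap_mb = heap_gb * 1024
    return heap_mb * (1 + overhead_fraction)


executor_mb = container_mb(11)  # container size for --executor-memory 11G
driver_mb = container_mb(3)     # container size for --driver-memory 3G

# How many 11g executors fit on a node that runs nothing else.
executors_per_empty_node = int(NODE_MB // executor_mb)

# Memory left on the node that already hosts the driver.
free_on_driver_node = NODE_MB - driver_mb

print(executors_per_empty_node)
print(free_on_driver_node < executor_mb)
```

Under these assumptions an empty node holds exactly one 11g executor, and the node hosting the driver does not have enough room left for another one, which matches the 2 executors observed.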

Answer 2

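Your cluster has only 3 nodes, and 1 of them runs the driver, so only 2 nodes are left. How would you create 6 executors?

I think you are confusing --num-executors with --executor-cores.

To increase concurrency you need more cores, so that the application uses all the CPUs in the cluster.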

Answer 3

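To make full use of a Spark cluster, set --num-executors, --executor-cores and --executor-memory according to the cluster:

The --num-executors command-line flag or the spark.executor.instances configuration property controls the number of executors requested;
The --executor-cores command-line flag or the spark.executor.cores configuration property controls the number of concurrent tasks an executor can run;
The --executor-memory command-line flag or the spark.executor.memory configuration property controls the heap size.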
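As a rough illustration of the flag-to-property pairs listed above, the sketch below turns the flags into equivalent --conf arguments for spark-submit and estimates how many tasks can run at once. The helper names and the example values (6 executors, 2 cores, 4G) are hypothetical, and the slot count assumes spark.task.cpus is left at its default of 1.

```python
# Flag-to-property pairs as listed in the answer above.
FLAG_TO_PROPERTY = {
    "--num-executors": "spark.executor.instances",
    "--executor-cores": "spark.executor.cores",
    "--executor-memory": "spark.executor.memory",
}


def to_conf_args(flags):
    """Rewrite the three flags as --conf key=value arguments (hypothetical helper)."""
    args = []
    for flag, value in flags.items():
        args += ["--conf", f"{FLAG_TO_PROPERTY[flag]}={value}"]
    return args


def task_slots(num_executors, executor_cores):
    """Tasks that can run concurrently, assuming spark.task.cpus = 1."""
    return num_executors * executor_cores


# Example values chosen only for illustration.
example = {"--num-executors": 6, "--executor-cores": 2, "--executor-memory": "4G"}
conf_args = to_conf_args(example)
slots = task_slots(6, 2)

print(" ".join(conf_args))
print(slots)
```

The slot count only holds if YARN actually grants every requested executor, which depends on the memory available on each node.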

Answer 4

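Note that yarn.nodemanager.resource.memory-mb is the total memory that a single NodeManager can allocate across all containers on one node.

In your case, since yarn.nodemanager.resource.memory-mb = 12G, the memory allocated to all YARN containers on any single node cannot add up to more than 12G.

You have requested 11G for each Spark executor container (--executor-memory 11G). Although 11G is less than 12G, it still does not work. Why?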

Because you have to account for spark.yarn.executor.memoryOverhead, which is max(executorMemory * 0.10, 384) MB (the default, unless you override it).

So the following must hold:

spark.executor.memory + spark.yarn.executor.memoryOverhead <= yarn.nodemanager.resource.memory-mb

For the latest documentation on spark.yarn.executor.memoryOverhead, see: https://spark.apache.org/docs/latest/running-on-yarn.html
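The inequality above can be tried out with a small sketch. It takes the default overhead as the larger of 10% of the executor memory and 384 MB, which is how the Spark documentation describes it, and it assumes "12G" means 12 * 1024 MB. The function names are hypothetical, and rounding by YARN's allocation settings is ignored.

```python
def executor_overhead_mb(executor_memory_mb, fraction=0.10, minimum_mb=384):
    """Default overhead: a fraction of executor memory, but at least minimum_mb."""
    return max(int(executor_memory_mb * fraction), minimum_mb)


def fits_on_node(executor_memory_mb, node_memory_mb, fraction=0.10):
    """True if one executor container (heap + overhead) fits within the node limit."""
    total = executor_memory_mb + executor_overhead_mb(executor_memory_mb, fraction)
    return total <= node_memory_mb


NODE_MB = 12 * 1024      # assumption: "12G" means 12288 MB
EXECUTOR_MB = 11 * 1024  # --executor-memory 11G

print(executor_overhead_mb(EXECUTOR_MB))
print(fits_on_node(EXECUTOR_MB, NODE_MB))
print(fits_on_node(EXECUTOR_MB, NODE_MB, fraction=0.07))
```

With a 10% overhead a single 11G executor already exceeds a 12G node in this model, while with the 7% figure used by older Spark versions it just fits, so the outcome depends on which default the running Spark version applies.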

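Also, spark.executor.instances is only a request. The Spark ApplicationMaster for your application asks the YARN ResourceManager for a number of containers equal to spark.executor.instances. The ResourceManager grants the request on NodeManager nodes based on the following:

Resource availability on the node. YARN scheduling has its own nuances, and a primer on how the YARN FairScheduler works is good background.
Whether the yarn.nodemanager.resource.memory-mb threshold would be exceeded on the node:
(number of Spark containers running on the node * (spark.executor.memory + spark.yarn.executor.memoryOverhead)) <= yarn.nodemanager.resource.memory-mb

If a request is not granted, it is queued and granted once the conditions above are met.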
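The grant-or-queue behaviour can be imitated with a toy model. This is not how the YARN scheduler is implemented: it is a first-fit sketch that only enforces the per-node memory limit, with container sizes taken as heap plus 7% overhead rounded down to whole MB. The function name, the placement order and the 4G comparison case are assumptions made for illustration.

```python
def grant_executors(requested, container_mb, node_free_mb):
    """First-fit toy model: returns (granted, still_queued)."""
    free = list(node_free_mb)  # copy so the caller's list is not modified
    granted = 0
    for _ in range(requested):
        for i, mb in enumerate(free):
            if mb >= container_mb:
                free[i] -= container_mb
                granted += 1
                break
        else:
            # No node has room; this request and the remaining ones stay queued.
            break
    return granted, requested - granted


NODE_MB = 12 * 1024  # assumption: 12288 MB per node
DRIVER_MB = 3287     # 3G driver + 7% overhead, rounded down
free = [NODE_MB - DRIVER_MB, NODE_MB, NODE_MB]  # driver already placed on node 1

# 11G executors (+7%): container of about 12052 MB.
result_11g = grant_executors(6, 12052, free)

# Hypothetical 4G executors (+7%): container of about 4382 MB.
result_4g = grant_executors(6, 4382, free)

print(result_11g)
print(result_4g)
```

In this model only 2 of the 6 requested 11G executors are granted and 4 stay queued, whereas 6 smaller 4G executors would all find room on the same three nodes.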

hadoop-yarn

apache-spark