Deploying application with spark-submit: Application is added to the scheduler and is not yet activated

I have a VirtualBox VM running CentOS Linux with 12 GB of RAM. I need to deploy 2 applications to run on Hadoop in a non-distributed configuration. This is my YARN configuration:

<configuration>
<property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>false</value>
</property>
<property>
   <name>yarn.nodemanager.aux-services</name>
   <value>mapreduce_shuffle</value>
</property>
<property>
   <name>yarn.resourcemanager.address</name>
   <value>0.0.0.0:8032</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-vcores</name>
  <value>130</value>
</property>
<property>
   <name>yarn.nodemanager.vmem-check-enabled</name>
   <value>false</value>
   <description>Whether virtual memory limits will be enforced for containers</description>
</property>
<property>
   <name>yarn.scheduler.maximum-allocation-mb</name>
   <value>4048</value>
</property>
<property>
   <name>yarn.nodemanager.vmem-pmem-ratio</name>
   <value>1</value>
   <description>Ratio between virtual memory to physical memory when
setting memory limits for containers</description>
</property>
<property>
   <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
   <value>1</value>
</property>
</configuration>

I deployed the first application and it runs fine:

spark-submit --master yarn --deploy-mode client --name OryxBatchLayer-ALSExample --class com.cloudera.oryx.batch.Main --files oryx.conf --driver-memory 500m --driver-java-options "-Dconfig.file=oryx.conf" --executor-memory 500m --executor-cores 1 --conf spark.executor.extraJavaOptions="-Dconfig.file=oryx.conf" --conf spark.ui.port=4040 --conf spark.io.compression.codec=lzf --conf spark.logConf=true --conf spark.serializer=org.apache.spark.serializer.KryoSerializer --conf spark.speculation=true --conf spark.ui.showConsoleProgress=false --conf spark.dynamicAllocation.enabled=false --num-executors=1 oryx-batch-2.8.0-SNAPSHOT.jar

The YARN ResourceManager UI on port 8088 shows that I am using 2 of 8 vcores and 2 GB of 8 GB of memory.

Now I deploy my second application:

spark-submit --master yarn --deploy-mode client --name OryxSpeedLayer-ALSExample --class com.cloudera.oryx.speed.Main --files oryx.conf --driver-memory 500m --driver-java-options "-Dconfig.file=oryx.conf" --executor-memory 500m --executor-cores 1 --conf spark.executor.extraJavaOptions="-Dconfig.file=oryx.conf" --conf spark.ui.port=4041 --conf spark.io.compression.codec=lzf --conf spark.logConf=true --conf spark.serializer=org.apache.spark.serializer.KryoSerializer --conf spark.speculation=true --conf spark.ui.showConsoleProgress=false --conf spark.dynamicAllocation.enabled=false --num-executors=1 oryx-speed-2.8.0-SNAPSHOT.jar

But this time I get a warning, and the second application appears to be frozen; at the very least, no memory is allocated to it:

2018-08-06 04:49:10 INFO Client:54 -
     client token: N/A
     diagnostics: [Mon Aug 06 04:49:09 -0400 2018] Application is added to the scheduler and is not yet activated. Queue's AM resource limit exceeded. Details : AM Partition = ; AM Resource Request = ; Queue Resource Limit for AM = ; User AM Resource Limit of the queue = ; Queue AM Resource Usage = ;
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1533545349902
     final status: UNDEFINED
     tracking URL: http://master:8088/proxy/application_1533542648791_0002/
     user: osboxes

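What is the root cause of the problem? How can I increase the queue's AM resource limit and the queue's user AM resource limit?

The fix is to edit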

~/hadoop-3.1.0/etc/hadoop/capacity-scheduler.xml

and change the value from 0.1 to 1:

<property>
    <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
    <value>1</value>
    <description>
      Maximum percent of resources in the cluster which can be used to run
      application masters i.e. controls number of concurrent running
      applications.
    </description>
  </property>
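
To see why raising the value from 0.1 to 1 unblocks the second application, here is a rough sketch of the admission arithmetic in Python. It is a simplification, not the scheduler's actual code. The 8192 MB queue size comes from the 8 GB shown in the ResourceManager UI; the 1024 MB per ApplicationMaster container is an assumption, because the real request was stripped from the log above. The function names are made up for illustration. The sketch relies on the Capacity Scheduler behaviour that the first AM in a queue is admitted even if it exceeds the limit, which would explain why the first application started and the second did not.

```python
def am_limit_mb(queue_capacity_mb, max_am_resource_percent):
    """Total memory that ApplicationMasters in the queue may use (simplified)."""
    return queue_capacity_mb * max_am_resource_percent


def can_activate(running_am_mb, new_am_mb, limit_mb):
    """Simplified admission rule for a newly submitted application's AM."""
    # The scheduler lets one application start even when the limit is too low.
    if running_am_mb == 0:
        return True
    # Every further AM must fit within the queue's AM limit.
    return running_am_mb + new_am_mb <= limit_mb


QUEUE_MB = 8192  # 8 GB, as reported by the ResourceManager UI in the question
AM_MB = 1024     # assumed AM container size; the log above does not show it

default_limit = am_limit_mb(QUEUE_MB, 0.1)  # value shipped in capacity-scheduler.xml
raised_limit = am_limit_mb(QUEUE_MB, 1.0)   # value after the fix

print("first app, default limit: ", can_activate(0, AM_MB, default_limit))
print("second app, default limit:", can_activate(AM_MB, AM_MB, default_limit))
print("second app, raised limit: ", can_activate(AM_MB, AM_MB, raised_limit))
```

Two hedged notes. The same property set in yarn-site.xml in the question apparently had no effect, most likely because the Capacity Scheduler reads its settings from capacity-scheduler.xml. After editing that file, the ResourceManager has to reload it, for example by restarting YARN or, as far as I know, by running yarn rmadmin -refreshQueues. Also keep in mind that a value of 1 lets ApplicationMasters take the whole queue, which is fine on a single test VM but probably too generous for a shared cluster.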