Deploying application with spark-submit: Application is added to the scheduler and is not yet activated
I have a VirtualBox VM running CentOS Linux with 12 GB of RAM. I need to deploy 2 applications to Hadoop in a non-distributed (single-node) configuration. This is my YARN configuration:
<configuration>
  <property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>0.0.0.0:8032</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>130</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
    <description>Whether virtual memory limits will be enforced for containers</description>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>4048</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>1</value>
    <description>Ratio between virtual memory to physical memory when
    setting memory limits for containers</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
    <value>1</value>
  </property>
</configuration>
I deploy the first application and it runs fine:
spark-submit --master yarn --deploy-mode client \
  --name OryxBatchLayer-ALSExample \
  --class com.cloudera.oryx.batch.Main \
  --files oryx.conf \
  --driver-memory 500m \
  --driver-java-options "-Dconfig.file=oryx.conf" \
  --executor-memory 500m \
  --executor-cores 1 \
  --conf spark.executor.extraJavaOptions="-Dconfig.file=oryx.conf" \
  --conf spark.ui.port=4040 \
  --conf spark.io.compression.codec=lzf \
  --conf spark.logConf=true \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  --conf spark.speculation=true \
  --conf spark.ui.showConsoleProgress=false \
  --conf spark.dynamicAllocation.enabled=false \
  --num-executors=1 \
  oryx-batch-2.8.0-SNAPSHOT.jar
The YARN UI at port 8088 shows that I am using 2 of 8 vcores and 2 GB of 8 GB of memory:
Now I deploy my second application:
spark-submit --master yarn --deploy-mode client \
  --name OryxSpeedLayer-ALSExample \
  --class com.cloudera.oryx.speed.Main \
  --files oryx.conf \
  --driver-memory 500m \
  --driver-java-options "-Dconfig.file=oryx.conf" \
  --executor-memory 500m \
  --executor-cores 1 \
  --conf spark.executor.extraJavaOptions="-Dconfig.file=oryx.conf" \
  --conf spark.ui.port=4041 \
  --conf spark.io.compression.codec=lzf \
  --conf spark.logConf=true \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  --conf spark.speculation=true \
  --conf spark.ui.showConsoleProgress=false \
  --conf spark.dynamicAllocation.enabled=false \
  --num-executors=1 \
  oryx-speed-2.8.0-SNAPSHOT.jar
But this time I get a warning, and the second application appears to be frozen; at least it is not allocating any memory:
2018-08-06 04:49:10 INFO Client:54 -
     client token: N/A
     diagnostics: [Mon Aug 06 04:49:09 -0400 2018] Application is added to the scheduler and is not yet activated. Queue's AM resource limit exceeded. Details : AM Partition = ; AM Resource Request = ; Queue Resource Limit for AM = ; User AM Resource Limit of the queue = ; Queue AM Resource Usage = ;
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1533545349902
     final status: UNDEFINED
     tracking URL: http://master:8088/proxy/application_1533542648791_0002/
     user: osboxes
What is the root cause of the problem, and how do I increase the queue's AM resource limit and the user AM resource limit of the queue?
The fix is to edit ~/hadoop-3.1.0/etc/hadoop/capacity-scheduler.xml and change the value from the default .1 to 1 (the Capacity Scheduler reads this property from capacity-scheduler.xml, which is presumably why setting it in yarn-site.xml, as above, had no effect):
<property>
  <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
  <value>1</value>
  <description>
    Maximum percent of resources in the cluster which can be used to run
    application masters i.e. controls number of concurrent running
    applications.
  </description>
</property>
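To see why the default blocks the second application, here is a rough back-of-the-envelope sketch. The numbers are assumptions for illustration: 8192 MB of schedulable memory (the "8g" shown in the RM UI) and a 1024 MB AM container (a client-mode Spark AM of roughly 512 MB plus overhead, rounded up to YARN's default yarn.scheduler.minimum-allocation-mb of 1024 MB).

```python
# Sketch of the Capacity Scheduler's AM resource limit check
# (illustrative numbers, not read from a live cluster).
cluster_mb = 8192       # total memory known to the scheduler
am_container_mb = 1024  # assumed size of each client-mode Spark AM

def am_limit_mb(max_am_resource_percent):
    """Memory the queue may spend on ApplicationMasters in total."""
    return cluster_mb * max_am_resource_percent

# With the default yarn.scheduler.capacity.maximum-am-resource-percent
# of 0.1, only ~819 MB may hold AMs: the first 1024 MB AM is still
# admitted (the scheduler always activates at least one AM), but the
# second one stays pending.
print(am_limit_mb(0.1))  # prints 819.2, which is < 2 * 1024
# After raising the value to 1, the whole cluster may hold AMs:
print(am_limit_mb(1.0))  # prints 8192.0, so both AMs activate
```

After editing capacity-scheduler.xml, `yarn rmadmin -refreshQueues` reloads the scheduler configuration without restarting the ResourceManager (a full restart also works).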