Unable to run sparkPi on Apache Spark cluster
Below is my Spark master UI, which shows 1 registered worker.
I am trying to run the SparkPi application on the cluster, using the following submit script:
./bin/spark-submit \
--class org.apache.spark.examples.SparkPi \
--master spark://159.8.201.251:7077 \
/opt/Spark/spark-1.2.1-bin-cdh4/lib/spark-examples-1.2.1-hadoop2.0.0-mr1-cdh4.2.0.jar \
1
But it keeps emitting the following warning and never finishes executing:
WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
I start the master with:
./sbin/start-master.sh
I start each worker and connect it to the master with:
./bin/spark-class org.apache.spark.deploy.worker.Worker spark://x.x.x.x:7077
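As an aside, the same addresses can also be pinned down in conf/spark-env.sh on each node rather than passed on the command line. A minimal sketch, assuming Spark 1.x standalone mode; x.x.x.x is the master's address as above, and y.y.y.y is a placeholder for the node's own address:
# conf/spark-env.sh -- addresses are placeholders
export SPARK_MASTER_IP=x.x.x.x   # address the master binds to and advertises to workers
export SPARK_LOCAL_IP=y.y.y.y    # address this node binds to; set per node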
Master log (repeats continuously):
15/05/01 01:16:48 INFO AppClient$ClientActor: Executor added: app-20150501005353-0000/40 on worker-20150501004757-spark-worker30-04-2015-23-11-51-1.abc.com-48624 (spark-worker30-04-2015-23-11-51-1.abc.com:48624) with 1 cores
15/05/01 01:16:48 INFO SparkDeploySchedulerBackend: Granted executor ID app-20150501005353-0000/40 on hostPort spark-worker30-04-2015-23-11-51-1.abc.com:48624 with 1 cores, 512.0 MB RAM
15/05/01 01:16:48 INFO AppClient$ClientActor: Executor updated: app-20150501005353-0000/40 is now RUNNING
15/05/01 01:16:48 INFO AppClient$ClientActor: Executor updated: app-20150501005353-0000/40 is now LOADING
15/05/01 01:16:55 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
15/05/01 01:17:10 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
15/05/01 01:17:23 INFO AppClient$ClientActor: Executor updated: app-20150501005353-0000/40 is now EXITED (Command exited with code 1)
15/05/01 01:17:23 INFO SparkDeploySchedulerBackend: Executor app-20150501005353-0000/40 removed: Command exited with code 1
15/05/01 01:17:23 ERROR SparkDeploySchedulerBackend: Asked to remove non-existent executor 40
Worker log (repeats continuously):
15/05/01 01:13:56 INFO Worker: Executor app-20150501005353-0000/34 finished with state EXITED message Command exited with code 1 exitStatus 1
15/05/01 01:13:56 INFO Worker: Asked to launch executor app-20150501005353-0000/35 for Spark Pi
Spark assembly has been built with Hive, including Datanucleus jars on classpath
15/05/01 01:13:58 INFO ExecutorRunner: Launch command: "java" "-cp" "::/opt/Spark/spark-1.2.1-bin-cdh4/conf:/opt/Spark/spark-1.2.1-bin-cdh4/lib/spark-assembly-1.2.1-hadoop2.0.0-mr1-cdh4.2.0.jar:/opt/Spark/spark-1.2.1-bin-cdh4/lib/datanucleus-core-3.2.10.jar:/opt/Spark/spark-1.2.1-bin-cdh4/lib/datanucleus-rdbms-3.2.9.jar:/opt/Spark/spark-1.2.1-bin-cdh4/lib/datanucleus-api-jdo-3.2.6.jar" "-Dspark.driver.port=48714" "-Xms512M" "-Xmx512M" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "akka.tcp://sparkDriver@spark-master-node30-04-2015-23-01-40.abc.com:48714/user/CoarseGrainedScheduler" "35" "spark-worker30-04-2015-23-11-51-1.abc.com" "1" "app-20150501005353-0000" "akka.tcp://sparkWorker@spark-worker30-04-2015-23-11-51-1.abc.com:48624/user/Worker"
15/05/01 01:14:31 INFO Worker: Executor app-20150501005353-0000/35 finished with state EXITED message Command exited with code 1 exitStatus 1
15/05/01 01:14:31 INFO Worker: Asked to launch executor app-20150501005353-0000/36 for Spark Pi
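The repeated "Command exited with code 1" means the executor JVM dies right after launch, and the actual error usually shows up in the executor's own stderr rather than in the worker log. In standalone mode each worker keeps that output under its work directory; a sketch of where to look, assuming the default layout of this install (app and executor IDs taken from the log above):
cat /opt/Spark/spark-1.2.1-bin-cdh4/work/app-20150501005353-0000/35/stderr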
The cause of this error: the workers were never able to actually connect to the master node, because the Spark master's IP address and hostname were missing from the workers' /etc/hosts file. For the cluster to work, every node must have a hosts entry for every other node of the cluster in its /etc/hosts file.
For example:
127.0.0.1 localhost.localdomain localhost
10.0.2.12 master.example.com master
10.0.2.13 worker1.example.com worker1
10.0.2.14 worker2.example.com worker2
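A quick way to verify the fix is to check, from every node, that each cluster hostname resolves to the intended IP and is reachable; a minimal check using the example names above:
# run on each node in the cluster
getent hosts master.example.com worker1.example.com worker2.example.com
ping -c 1 master.example.com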