Can't run Spark 1.2 in standalone mode on Mac
I'm trying to learn Spark by running it in standalone mode on my MacBook Pro (10.9.2). I downloaded and built it using the instructions here:
http://spark.apache.org/docs/latest/building-spark.html
Then I started the master using the instructions here:
https://spark.apache.org/docs/latest/spark-standalone.html#starting-a-cluster-manually
That worked, although I had to add the following line to my SPARK_HOME/conf/spark-env.sh file to get it to start successfully:
export SPARK_MASTER_IP="127.0.0.1"
However, when I try to run this command to start a worker:
./bin/spark-class org.apache.spark.deploy.worker.Worker spark://127.0.0.1:7077
it fails with this error:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/01/26 16:29:17 INFO Worker: Registered signal handlers for [TERM, HUP, INT]
15/01/26 16:29:17 INFO SecurityManager: Changing view acls to: erioconnor
15/01/26 16:29:17 INFO SecurityManager: Changing modify acls to: erioconnor
15/01/26 16:29:17 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(erioconnor); users with modify permissions: Set(erioconnor)
15/01/26 16:29:17 INFO Slf4jLogger: Slf4jLogger started
15/01/26 16:29:17 INFO Remoting: Starting remoting
15/01/26 16:29:17 ERROR NettyTransport: failed to bind to /10.252.181.130:0, shutting down Netty transport
15/01/26 16:29:17 ERROR Remoting: Remoting error: [Startup failed] [
akka.remote.RemoteTransportException: Startup failed
at akka.remote.Remoting.akka$remote$Remoting$$notifyError(Remoting.scala:136)
at akka.remote.Remoting.start(Remoting.scala:201)
at akka.remote.RemoteActorRefProvider.init(RemoteActorRefProvider.scala:184)
at akka.actor.ActorSystemImpl.liftedTree2(ActorSystem.scala:618)
at akka.actor.ActorSystemImpl._start$lzycompute(ActorSystem.scala:615)
at akka.actor.ActorSystemImpl._start(ActorSystem.scala:615)
at akka.actor.ActorSystemImpl.start(ActorSystem.scala:632)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:141)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:118)
at org.apache.spark.util.AkkaUtils$.org$apache$spark$util$AkkaUtils$$doCreateActorSystem(AkkaUtils.scala:121)
at org.apache.spark.util.AkkaUtils$$anonfun.apply(AkkaUtils.scala:54)
at org.apache.spark.util.AkkaUtils$$anonfun.apply(AkkaUtils.scala:53)
at org.apache.spark.util.Utils$$anonfun$startServiceOnPort.apply$mcVI$sp(Utils.scala:1676)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1667)
at org.apache.spark.util.AkkaUtils$.createActorSystem(AkkaUtils.scala:56)
at org.apache.spark.deploy.worker.Worker$.startSystemAndActor(Worker.scala:495)
at org.apache.spark.deploy.worker.Worker$.main(Worker.scala:475)
at org.apache.spark.deploy.worker.Worker.main(Worker.scala)
Caused by: org.jboss.netty.channel.ChannelException: Failed to bind to: /10.252.181.130:0
at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272)
at akka.remote.transport.netty.NettyTransport$$anonfun$listen.apply(NettyTransport.scala:393)
at akka.remote.transport.netty.NettyTransport$$anonfun$listen.apply(NettyTransport.scala:389)
at scala.util.Success$$anonfun$map.apply(Try.scala:206)
at scala.util.Try$.apply(Try.scala:161)
at scala.util.Success.map(Try.scala:206)
at scala.concurrent.Future$$anonfun$map.apply(Future.scala:235)
at scala.concurrent.Future$$anonfun$map.apply(Future.scala:235)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run.processBatch(BatchingExecutor.scala:67)
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run.apply$mcV$sp(BatchingExecutor.scala:82)
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run.apply(BatchingExecutor.scala:59)
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run.apply(BatchingExecutor.scala:59)
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:58)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.net.BindException: Can't assign requested address
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:444)
at sun.nio.ch.Net.bind(Net.java:436)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.jboss.netty.channel.socket.nio.NioServerBoss$RegisterTask.run(NioServerBoss.java:193)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:372)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:296)
at org.jboss.netty.channel.socket.nio.NioServerBoss.run(NioServerBoss.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
]
15/01/26 16:29:17 WARN Utils: Service 'sparkWorker' could not bind on port 0. Attempting port 1.
15/01/26 16:29:17 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/01/26 16:29:17 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
15/01/26 16:29:17 INFO Remoting: Remoting shut down
15/01/26 16:29:17 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
The error repeats a total of 16 times before the process gives up.
I don't understand the reference to the IP address 10.252.181.130. It doesn't appear anywhere in the Spark code/config that I've been able to find, and googling it turned up nothing. I did notice this line in the log file when I started the master:
15/01/26 16:27:18 INFO MasterWebUI: Started MasterWebUI at http://10.252.181.130:8080
I looked at the Scala source for MasterWebUI (and WebUI, which it extends) and noticed that they appear to get the IP address from an environment variable named SPARK_PUBLIC_DNS. I tried setting that variable to 127.0.0.1 in my spark-env.sh script, but that didn't help. In fact, it prevented my master from starting at all:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/01/26 19:46:16 INFO Master: Registered signal handlers for [TERM, HUP, INT]
Exception in thread "main" java.net.UnknownHostException: LM-PDX-00871419: LM-PDX-00871419: nodename nor servname provided, or not known
at java.net.InetAddress.getLocalHost(InetAddress.java:1473)
at org.apache.spark.util.Utils$.findLocalIpAddress(Utils.scala:620)
at org.apache.spark.util.Utils$.localIpAddress$lzycompute(Utils.scala:612)
at org.apache.spark.util.Utils$.localIpAddress(Utils.scala:612)
at org.apache.spark.util.Utils$.localIpAddressHostname$lzycompute(Utils.scala:613)
at org.apache.spark.util.Utils$.localIpAddressHostname(Utils.scala:613)
at org.apache.spark.util.Utils$$anonfun$localHostName.apply(Utils.scala:665)
at org.apache.spark.util.Utils$$anonfun$localHostName.apply(Utils.scala:665)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.util.Utils$.localHostName(Utils.scala:665)
at org.apache.spark.deploy.master.MasterArguments.<init>(MasterArguments.scala:27)
at org.apache.spark.deploy.master.Master$.main(Master.scala:819)
at org.apache.spark.deploy.master.Master.main(Master.scala)
Caused by: java.net.UnknownHostException: LM-PDX-00871419: nodename nor servname provided, or not known
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress.lookupAllHostAddr(InetAddress.java:901)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293)
at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
... 12 more
(Note that LM-PDX-00871419 is my Mac's hostname.)
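For reference, a rough reconstruction of what my conf/spark-env.sh contained at this point (the SPARK_PUBLIC_DNS line is the addition that broke master startup):
# conf/spark-env.sh (approximate)
export SPARK_MASTER_IP="127.0.0.1"
export SPARK_PUBLIC_DNS="127.0.0.1"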
So at this point I'm somewhat stumped. I'd appreciate any suggestions about where to look next.
I've had problems using the loopback address 127.0.0.1 or localhost. I've had better luck with the machine's actual public IP address, e.g. 192.168.1.101. I don't know where 10.252.181.130 is coming from either.
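As a rough sketch (assuming 192.168.1.101 is your machine's address; substitute your own), the relevant conf/spark-env.sh entries and worker command would look something like this:
# conf/spark-env.sh - bind master and local services to the machine's real address
export SPARK_MASTER_IP="192.168.1.101"
export SPARK_LOCAL_IP="192.168.1.101"
# start a worker against that master
./bin/spark-class org.apache.spark.deploy.worker.Worker spark://192.168.1.101:7077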
For your Mac's hostname, try appending ".local" to it. If that doesn't work, add an entry to /etc/hosts.
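For example (a sketch using the hostname from the question; adjust for your own machine), the /etc/hosts entry could look like this:
# /etc/hosts - make the local hostname resolvable
127.0.0.1   localhost LM-PDX-00871419 LM-PDX-00871419.local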
Finally, I'd recommend using the sbin/start-master.sh and sbin/start-slave.sh scripts to run these services.
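A minimal sketch of that (the exact arguments to start-slave.sh vary a bit between Spark versions, so check the scripts in your build's sbin/ directory):
# start the master, then attach a worker to it
sbin/start-master.sh
sbin/start-slave.sh spark://192.168.1.101:7077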
If you're using IntelliJ IDEA, it may help to know that you can set the environment variable SPARK_LOCAL_IP to 127.0.0.1 in the Run/Debug configuration.
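The shell equivalent (a sketch) is to set the variable inline when launching the worker:
# set SPARK_LOCAL_IP just for this command
SPARK_LOCAL_IP=127.0.0.1 ./bin/spark-class org.apache.spark.deploy.worker.Worker spark://127.0.0.1:7077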