为什么 spark-shell 会因 "SymbolTable.exitingPhase...java.lang.NullPointerException" 而失败?

Why does spark-shell fail with "SymbolTable.exitingPhase...java.lang.NullPointerException"?

刚刚在我的 arch linux 中安装了 Apache Spark 2.2.0-4,以及 Scala 2.12.4 和 Apache Hadoop 3.0,一旦我执行 spark-shell.

Exception in thread "main" java.lang.NullPointerException
at scala.reflect.internal.SymbolTable.exitingPhase(SymbolTable.scala:256)
at scala.tools.nsc.interpreter.IMain$Request.x$lzycompute(IMain.scala:896)
at scala.tools.nsc.interpreter.IMain$Request.x(IMain.scala:895)
at scala.tools.nsc.interpreter.IMain$Request.headerPreamble$lzycompute(IMain.scala:895)
at scala.tools.nsc.interpreter.IMain$Request.headerPreamble(IMain.scala:895)
at scala.tools.nsc.interpreter.IMain$Request$Wrapper.preamble(IMain.scala:918)
at scala.tools.nsc.interpreter.IMain$CodeAssembler$$anonfun$apply.apply(IMain.scala:1337)
at scala.tools.nsc.interpreter.IMain$CodeAssembler$$anonfun$apply.apply(IMain.scala:1336)
at scala.tools.nsc.util.package$.stringFromWriter(package.scala:64)
at scala.tools.nsc.interpreter.IMain$CodeAssembler$class.apply(IMain.scala:1336)
at scala.tools.nsc.interpreter.IMain$Request$Wrapper.apply(IMain.scala:908)
at scala.tools.nsc.interpreter.IMain$Request.compile$lzycompute(IMain.scala:1002)
at scala.tools.nsc.interpreter.IMain$Request.compile(IMain.scala:997)
at scala.tools.nsc.interpreter.IMain.compile(IMain.scala:579)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:567)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:565)
at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:807)
at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:681)
at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:395)
at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark.apply$mcV$sp(SparkILoop.scala:38)
at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark.apply(SparkILoop.scala:37)
at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark.apply(SparkILoop.scala:37)
at scala.tools.nsc.interpreter.IMain.beQuietDuring(IMain.scala:214)
at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:37)
at org.apache.spark.repl.SparkILoop.loadFiles(SparkILoop.scala:98)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process.apply$mcZ$sp(ILoop.scala:920)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process.apply(ILoop.scala:909)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process.apply(ILoop.scala:909)
at scala.reflect.internal.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:97)
at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:909)
at org.apache.spark.repl.Main$.doMain(Main.scala:70)
at org.apache.spark.repl.Main$.main(Main.scala:53)
at org.apache.spark.repl.Main.main(Main.scala)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:564)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
at org.apache.spark.deploy.SparkSubmit$.doRunMain(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

检查 后,我尝试使用 jdk 8,但这个解决方案对我不起作用。

能否请您提供线索,了解它还能是什么?

编辑 2017-12-30:

这是我的控制台:

[yago@CRISTINA-PC ~]$ java -version
java version "1.8.0_152"
Java(TM) SE Runtime Environment (build 1.8.0_152-b16)
Java HotSpot(TM) 64-Bit Server VM (build 25.152-b16, mixed mode)

[yago@CRISTINA-PC ~]$ spark-shell 
/usr/bin/hadoop
WARNING: HADOOP_SLAVES has been replaced by HADOOP_WORKERS. Using 
value of HADOOP_SLAVES.
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use 
setLogLevel(newLevel).

Failed to initialize compiler: object java.lang.Object in compiler 
mirror not found.
** Note that as of 2.8 scala does not assume use of the java 
classpath.
** For the old behavior pass -usejavacp to scala, or if using a 
Settings
** object programmatically, settings.usejavacp.value = true.

Failed to initialize compiler: object java.lang.Object in compiler 
mirror not found.
** Note that as of 2.8 scala does not assume use of the java 
classpath.
** For the old behavior pass -usejavacp to scala, or if using a 
Settings
** object programmatically, settings.usejavacp.value = true.
Exception in thread "main" java.lang.NullPointerException
at scala.reflect.internal.SymbolTable.exitingPhase(SymbolTable.scala:256)
at scala.tools.nsc.interpreter.IMain$Request.x$lzycompute(IMain.scala:896)
at scala.tools.nsc.interpreter.IMain$Request.x(IMain.scala:895)
at scala.tools.nsc.interpreter.IMain$Request.headerPreamble$lzycompute(IMain.scala:895)
at scala.tools.nsc.interpreter.IMain$Request.headerPreamble(IMain.scala:895)
at scala.tools.nsc.interpreter.IMain$Request$Wrapper.preamble(IMain.scala:918)
at scala.tools.nsc.interpreter.IMain$CodeAssembler$$anonfun$apply.apply(IMain.scala:1337)
at scala.tools.nsc.interpreter.IMain$CodeAssembler$$anonfun$apply.apply(IMain.scala:1336)
at scala.tools.nsc.util.package$.stringFromWriter(package.scala:64)
at scala.tools.nsc.interpreter.IMain$CodeAssembler$class.apply(IMain.scala:1336)
at scala.tools.nsc.interpreter.IMain$Request$Wrapper.apply(IMain.scala:908)
at scala.tools.nsc.interpreter.IMain$Request.compile$lzycompute(IMain.scala:1002)
at scala.tools.nsc.interpreter.IMain$Request.compile(IMain.scala:997)
at scala.tools.nsc.interpreter.IMain.compile(IMain.scala:579)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:567)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:565)
at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:807)
at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:681)
at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:395)
at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark.apply$mcV$sp(SparkILoop.scala:38)
at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark.apply(SparkILoop.scala:37)
at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark.apply(SparkILoop.scala:37)
at scala.tools.nsc.interpreter.IMain.beQuietDuring(IMain.scala:214)
at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:37)
at org.apache.spark.repl.SparkILoop.loadFiles(SparkILoop.scala:98)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process.apply$mcZ$sp(ILoop.scala:920)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process.apply(ILoop.scala:909)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process.apply(ILoop.scala:909)
at scala.reflect.internal.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:97)
at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:909)
at org.apache.spark.repl.Main$.doMain(Main.scala:70)
at org.apache.spark.repl.Main$.main(Main.scala:53)
at org.apache.spark.repl.Main.main(Main.scala)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:564)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
at org.apache.spark.deploy.SparkSubmit$.doRunMain(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

编辑 2017/12/31

[yago@CRISTINA-PC ~]$ export SPARK_PRINT_LAUNCH_COMMAND=1
[yago@CRISTINA-PC ~]$ spark-shell
/usr/bin/hadoop
WARNING: HADOOP_SLAVES has been replaced by HADOOP_WORKERS. Using 
value of HADOOP_SLAVES.
Spark Command: /usr/lib/jvm/default-runtime/bin/java -cp /opt/apache-
spark/conf/:/opt/apache-spark/jars/*:/etc/hadoop/:/usr/lib/hadoop-
3.0.0/share/hadoop/common/lib/*:/usr/lib/hadoop-
3.0.0/share/hadoop/common/*:/usr/lib/hadoop-
3.0.0/share/hadoop/hdfs/:/usr/lib/hadoop-
3.0.0/share/hadoop/hdfs/lib/*:/usr/lib/hadoop-
3.0.0/share/hadoop/hdfs/*:/usr/lib/hadoop-
3.0.0/share/hadoop/mapreduce/*:/usr/lib/hadoop-
3.0.0/share/hadoop/yarn/:/usr/lib/hadoop-
3.0.0/share/hadoop/yarn/lib/*:/usr/lib/hadoop-
3.0.0/share/hadoop/yarn/* -Dscala.usejavacp=true -Xmx1g 
org.apache.spark.deploy.SparkSubmit --class 
org.apache.spark.repl.Main --name Spark shell spark-shell
========================================
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use 
setLogLevel(newLevel).

Failed to initialize compiler: object java.lang.Object in compiler 
mirror not found.
** Note that as of 2.8 scala does not assume use of the java 
classpath.
** For the old behavior pass -usejavacp to scala, or if using a 
Settings
** object programmatically, settings.usejavacp.value = true.

Failed to initialize compiler: object java.lang.Object in compiler 
mirror not found.
** Note that as of 2.8 scala does not assume use of the java 
classpath.
** For the old behavior pass -usejavacp to scala, or if using a 
Settings
** object programmatically, settings.usejavacp.value = true.
Exception in thread "main" java.lang.NullPointerException
at

...

(same exception as before)

...

[yago@CRISTINA-PC ~]$ /usr/lib/jvm/default-runtime/bin/java -version
java version "9.0.1"
Java(TM) SE Runtime Environment (build 9.0.1+11)
Java HotSpot(TM) 64-Bit Server VM (build 9.0.1+11, mixed mode)

TL;DR Spark 支持Java 8(不支持Java 9)。


Spark 最高 2.3.0-SNAPSHOT(今天由 master 构建)支持 Java 8.

PATH 中使用 Java 9 你会得到你一直面临的异常。

$ java -version
java version "9.0.1"
Java(TM) SE Runtime Environment (build 9.0.1+11)
Java HotSpot(TM) 64-Bit Server VM (build 9.0.1+11, mixed mode)

Failed to initialize compiler: object java.lang.Object in compiler mirror not found.
** Note that as of 2.8 scala does not assume use of the java classpath.
** For the old behavior pass -usejavacp to scala, or if using a Settings
** object programmatically, settings.usejavacp.value = true.

Failed to initialize compiler: object java.lang.Object in compiler mirror not found.
** Note that as of 2.8 scala does not assume use of the java classpath.
** For the old behavior pass -usejavacp to scala, or if using a Settings
** object programmatically, settings.usejavacp.value = true.
Exception in thread "main" java.lang.NullPointerException
    at scala.reflect.internal.SymbolTable.exitingPhase(SymbolTable.scala:256)
    at scala.tools.nsc.interpreter.IMain$Request.x$lzycompute(IMain.scala:896)
    at scala.tools.nsc.interpreter.IMain$Request.x(IMain.scala:895)
    at scala.tools.nsc.interpreter.IMain$Request.headerPreamble$lzycompute(IMain.scala:895)
    at scala.tools.nsc.interpreter.IMain$Request.headerPreamble(IMain.scala:895)
    at scala.tools.nsc.interpreter.IMain$Request$Wrapper.preamble(IMain.scala:918)
    at scala.tools.nsc.interpreter.IMain$CodeAssembler$$anonfun$apply.apply(IMain.scala:1337)
    at scala.tools.nsc.interpreter.IMain$CodeAssembler$$anonfun$apply.apply(IMain.scala:1336)
    at scala.tools.nsc.util.package$.stringFromWriter(package.scala:64)
    at scala.tools.nsc.interpreter.IMain$CodeAssembler$class.apply(IMain.scala:1336)
    at scala.tools.nsc.interpreter.IMain$Request$Wrapper.apply(IMain.scala:908)
    at scala.tools.nsc.interpreter.IMain$Request.compile$lzycompute(IMain.scala:1002)
    at scala.tools.nsc.interpreter.IMain$Request.compile(IMain.scala:997)
    at scala.tools.nsc.interpreter.IMain.compile(IMain.scala:579)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:567)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:565)
    at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:807)
    at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:681)
    at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:395)
    at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$$anonfun$apply$mcV$sp$$anonfun$apply$mcV$sp.apply(SparkILoop.scala:79)
    at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$$anonfun$apply$mcV$sp$$anonfun$apply$mcV$sp.apply(SparkILoop.scala:79)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$$anonfun$apply$mcV$sp.apply$mcV$sp(SparkILoop.scala:79)
    at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$$anonfun$apply$mcV$sp.apply(SparkILoop.scala:79)
    at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$$anonfun$apply$mcV$sp.apply(SparkILoop.scala:79)
    at scala.tools.nsc.interpreter.ILoop.savingReplayStack(ILoop.scala:91)
    at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark.apply$mcV$sp(SparkILoop.scala:78)
    at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark.apply(SparkILoop.scala:78)
    at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark.apply(SparkILoop.scala:78)
    at scala.tools.nsc.interpreter.IMain.beQuietDuring(IMain.scala:214)
    at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:77)
    at org.apache.spark.repl.SparkILoop.loadFiles(SparkILoop.scala:110)
    at scala.tools.nsc.interpreter.ILoop$$anonfun$process.apply$mcZ$sp(ILoop.scala:920)
    at scala.tools.nsc.interpreter.ILoop$$anonfun$process.apply(ILoop.scala:909)
    at scala.tools.nsc.interpreter.ILoop$$anonfun$process.apply(ILoop.scala:909)
    at scala.reflect.internal.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:97)
    at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:909)
    at org.apache.spark.repl.Main$.doMain(Main.scala:76)
    at org.apache.spark.repl.Main$.main(Main.scala:56)
    at org.apache.spark.repl.Main.main(Main.scala)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:564)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:878)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain(SparkSubmit.scala:197)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

紧接着 Java 8 进入 PATH,Spark 正常。

$ java -version
java version "1.8.0_152"
Java(TM) SE Runtime Environment (build 1.8.0_152-b16)
Java HotSpot(TM) 64-Bit Server VM (build 25.152-b16, mixed mode)

$ ./bin/spark-shell
17/12/30 20:15:07 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
17/12/30 20:15:12 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
Spark context Web UI available at http://192.168.1.2:4041
Spark context available as 'sc' (master = local[*], app id = local-1514661312813).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.3.0-SNAPSHOT
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_152)
Type in expressions to have them evaluated.
Type :help for more information.

scala> spark.version
res0: String = 2.3.0-SNAPSHOT

由于 OP 在 Arch Linux 上并已安装 Apache Spark package from AUR, the Sources section shows 7 files that are customized for the Linux distribution, incl. spark-env.sh

spark-env.sh 中有一段非常有趣的行设置了 JAVA_HOME:

export JAVA_HOME=/usr/lib/jvm/default-runtime

无论 spark-shell 使用的 PATH 环境变量如何,都可以 select Java 9。

PROTIP 您可以使用 SPARK_PRINT_LAUNCH_COMMAND 环境变量来了解 spark-shell 开头的命令和 Java,例如SPARK_PRINT_LAUNCH_COMMAND=1 spark-shell。您还可以查看 sh -x spark-shell,这是更多 Linux 调试 shell 脚本的方式(如 spark-shell)。

一种解决方案是将 /usr/lib/jvm/default-runtime 配置为默认使用 Java 8(而不是 Java 9),但那是......好吧......你的家庭练习。 火花快乐!

只是为了澄清问题,正如您在第二次编辑中看到的那样,spark 使用的是 java 9。 spark-env.sh 将 JAVA_HOME 设置为 :

export JAVA_HOME=/usr/lib/jvm/default-runtime 

要在 arch linux 中设置默认值 jdk,请检查文件夹 /usr/lib/jvm 以查看不同的 jvm 发行版。

要将分发标记为默认,请执行以下操作:

archlinux-java set java-8-jdk

截图

[yago@CRISTINA-PC ~]$ ls /usr/lib/jvm/
default/         default-runtime/ java-8-jdk/      java-8-openjdk/  
java-9-jdk/      
[yago@CRISTINA-PC ~]$ archlinux-java set java-8-jdk
This script must be run as root

[yago@CRISTINA-PC ~]$ sudo archlinux-java set java-8-jdk
[sudo] password for yago: 
[yago@CRISTINA-PC ~]$ sudo archlinux-java set java-8-jdk
[yago@CRISTINA-PC ~]$ spark-shell
/usr/bin/hadoop
WARNING: HADOOP_SLAVES has been replaced by HADOOP_WORKERS. Using 
value of HADOOP_SLAVES.
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use 
setLogLevel(newLevel).
2017-12-31 10:47:13,050 WARN util.NativeCodeLoader: Unable to load 
native-hadoop library for your platform... using builtin-java classes 
where applicable
Spark context Web UI available at http://127.0.0.1:4040
Spark context available as 'sc' (master = local[*], app id = local-
1514717237307).
Spark session available as 'spark'.
Welcome to
     ____              __
    / __/__  ___ _____/ /__
   _   \ \/ _ \/ _ `/ __/  '_/
  /___/ .__/\_,_/_/ /_/\_\   version 2.2.0
     /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 
1.8.0_152)
Type in expressions to have them evaluated.
Type :help for more information.

scala>