Can't instantiate SparkSession on EMR 5.0 HUE
I'm running an EMR 5.0 cluster and using HUE to create an Oozie workflow to submit a Spark 2.0 job. I have run the job with spark-submit directly on YARN, and as a step on the same cluster, with no problems. But when I run it through HUE, I get the following error:
java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.internal.SessionState':
at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:949)
at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:111)
at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:110)
at org.apache.spark.sql.SparkSession.conf$lzycompute(SparkSession.scala:133)
at org.apache.spark.sql.SparkSession.conf(SparkSession.scala:133)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate.apply(SparkSession.scala:838)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate.apply(SparkSession.scala:838)
at scala.collection.mutable.HashMap$$anonfun$foreach.apply(HashMap.scala:99)
at scala.collection.mutable.HashMap$$anonfun$foreach.apply(HashMap.scala:99)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:838)
at be.infofarm.App$.main(App.scala:22)
at be.infofarm.App.main(App.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon.run(ApplicationMaster.scala:627)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:946)
... 19 more
Caused by: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.internal.SharedState':
at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:949)
at org.apache.spark.sql.SparkSession$$anonfun$sharedState.apply(SparkSession.scala:100)
at org.apache.spark.sql.SparkSession$$anonfun$sharedState.apply(SparkSession.scala:100)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession.sharedState$lzycompute(SparkSession.scala:99)
at org.apache.spark.sql.SparkSession.sharedState(SparkSession.scala:98)
at org.apache.spark.sql.internal.SessionState.<init>(SessionState.scala:153)
... 24 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:946)
... 30 more
Caused by: java.lang.Exception: Could not find resource path for Web UI: org/apache/spark/sql/execution/ui/static
at org.apache.spark.ui.JettyUtils$.createStaticHandler(JettyUtils.scala:182)
at org.apache.spark.ui.WebUI.addStaticHandler(WebUI.scala:119)
at org.apache.spark.sql.execution.ui.SQLTab.<init>(SQLTab.scala:32)
at org.apache.spark.sql.internal.SharedState$$anonfun$createListenerAndUI.apply(SharedState.scala:96)
at org.apache.spark.sql.internal.SharedState$$anonfun$createListenerAndUI.apply(SharedState.scala:96)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.sql.internal.SharedState.createListenerAndUI(SharedState.scala:96)
at org.apache.spark.sql.internal.SharedState.<init>(SharedState.scala:44)
... 35 more
When I don't use spark.sql or a SparkSession in my Spark job (only a SparkContext), it runs fine. If anyone has any idea what is going on, I'd appreciate it.
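For reference, the trace shows the failure inside SparkSession.Builder.getOrCreate (App.scala:22). A minimal sketch of the kind of code involved — the actual App.scala isn't shown in the question, so the object layout and app name here are assumptions:

package be.infofarm

import org.apache.spark.sql.SparkSession

object App {
  def main(args: Array[String]): Unit = {
    // This is the call that fails under Oozie (App.scala:22 in the trace),
    // while the same jar works fine via spark-submit or as an EMR step.
    val spark = SparkSession.builder()
      .appName("spark-job") // hypothetical name, not from the question
      .getOrCreate()

    // ... spark.sql(...) / DataFrame work would go here ...

    spark.stop()
  }
}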
Edit 1
My Maven assembly:
<build>
<sourceDirectory>src/main/scala</sourceDirectory>
<testSourceDirectory>src/test/scala</testSourceDirectory>
<plugins>
<plugin>
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<version>3.1.3</version>
<executions>
<execution>
<goals>
<goal>compile</goal>
<goal>testCompile</goal>
</goals>
<configuration>
<args>
<arg>-dependencyfile</arg>
<arg>${project.build.directory}/.scala_dependencies</arg>
</args>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<artifactId>maven-assembly-plugin</artifactId>
<configuration>
<archive>
<manifest>
<mainClass>be.infofarm.App</mainClass>
</manifest>
</archive>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
</configuration>
<executions>
<execution>
<id>make-assembly</id> <!-- this is used for inheritance merges -->
<phase>package</phase> <!-- bind to the packaging phase -->
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
When you run the jar with spark-submit, all the relevant jars are available on the machine's classpath, but when you execute the same jar through Oozie, those jars are not available in Oozie's 'sharelib'.
You can check what it contains by running the following command:
oozie admin -shareliblist spark
Step 1. Upload the required jars from the local machine to HDFS:
hdfs dfs -put /usr/lib/spark/jars/*.jar /user/oozie/share/lib/lib_timestamp/spark/
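Note that lib_timestamp above is a placeholder for the actual timestamped sharelib directory on your cluster; you can find the real name by listing the parent directory, for example:

hdfs dfs -ls /user/oozie/share/lib/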
Uploading the jars to HDFS alone does not add them to the sharelib; you need to update the sharelib by running:
oozie admin -sharelibupdate
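Depending on how the workflow is set up, you may also need to tell Oozie to put the Spark sharelib on the action's classpath. A minimal sketch of the relevant job.properties entries — these are standard Oozie properties, but whether your workflow needs them is an assumption:

oozie.use.system.libpath=true
oozie.action.sharelib.for.spark=spark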
Hope this helps.