Artifacts (.jar) generated with IntelliJ IDEA (Scala/Spark) throw ClassNotFoundException with spark-submit
I have looked at various related suggestions, but I am still struggling.
My setup:
Spark: 2.3.1
Scala: 2.11.8
OS: Windows 10
IDE: IntelliJ IDEA
Code:
package: testpackage
merge.scala --> has a do() method
mymain.scala --> has the main() method, which calls merge.do()
Project Settings --> Artifacts -->
Main Class --> testpackage.mymain
Class Path --> <blank>
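For concreteness, the layout above corresponds roughly to the following sketch. The method bodies are hypothetical, and note that `do` is a reserved word in Scala, so a method literally named do() has to be written with backticks:

```scala
package testpackage

// Sketch of merge.scala; `do` is a reserved word in Scala,
// so a literal do() method needs backtick quoting.
object merge {
  def `do`(): String = "merged" // body hypothetical
}

// Sketch of mymain.scala: the entry point named in the artifact settings.
object mymain {
  def main(args: Array[String]): Unit = {
    println(merge.`do`()) // prints "merged"
  }
}
```
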
What works:
1. The application runs properly in the IDE
2. JAR creation: the artifact is generated properly as testpackage.jar
3. I can see the classes (along with various other libraries) when I open testpackage.jar in WinRAR:
testpackage\merge$$anonfun.class
testpackage\merge$$anonfun.class
testpackage\merge$.class
testpackage\merge.class
testpackage\mymain$.class
testpackage\mymain.class
What doesn't work:
spark-submit from the command prompt throws this exception:
java.lang.ClassNotFoundException: testpackage.mymain
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.util.Utils$.classForName(Utils.scala:238)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:851)
at org.apache.spark.deploy.SparkSubmit$.doRunMain(SparkSubmit.scala:198)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
More information:
1. spark-submit was executed from the jar folder (~out\artifacts\testpackage_jar)
2. testpackage.jar is about 128 MB
I get the same exception if I try:
spark-submit testpackage.jar C:\temp\input.csv
spark-submit --class testpackage.mymain testpackage.jar C:\temp\input.csv
spark-submit --class mymain testpackage.jar C:\temp\input.csv
Also tried the following statement in build.sbt:
mainClass in (Compile, packageBin) := Some("testpackage.mymain")
and also:
mainClass in (Compile, packageBin) := Some("mymain")
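For reference, a build.sbt carrying that mainClass setting would look roughly like this (a sketch only; the name, version, and dependency values are assumptions, not taken from the post):

```scala
// build.sbt (sketch; project name and versions are assumptions)
name := "testpackage"
version := "0.1"
scalaVersion := "2.11.8"

// Spark is supplied by spark-submit at runtime, so mark it "provided"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.3.1" % "provided"

// Records the main class in the jar manifest produced by `sbt package`
mainClass in (Compile, packageBin) := Some("testpackage.mymain")
```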
Also tried keeping the jar in the Spark bin folder, which on my machine is
C:\Spark\spark-2.3.1-bin-hadoop2.7\bin
Tried using --master local[*] and a few other combinations.
Thanks for any help!!
I finally found the solution, in case anyone else runs into this: make sure there is only one entry, "'your package' compile output", under "'your package'.jar" in Project Structure --> Artifacts --> Output Layout.
It is strange, and I still don't know why it did not work before!
Now my jar is about 10 KB and it works fine! :)
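One way to confirm the expected class actually made it into a jar, before handing it to spark-submit, is to open the jar and look for the class entry. This is a self-contained sketch (it builds a tiny throwaway jar so the check can be demonstrated without a real Spark artifact; the entry names are hypothetical):

```scala
import java.io.{File, FileOutputStream}
import java.util.jar.{JarEntry, JarFile, JarOutputStream}

object JarCheck {
  // Returns true if the jar contains the .class entry for the
  // given fully qualified class name.
  def containsClass(jarPath: String, className: String): Boolean = {
    val entryName = className.replace('.', '/') + ".class"
    val jar = new JarFile(jarPath)
    try jar.getJarEntry(entryName) != null
    finally jar.close()
  }

  def main(args: Array[String]): Unit = {
    // Build a tiny throwaway jar containing one (empty) class entry.
    val tmp = File.createTempFile("testpackage", ".jar")
    val out = new JarOutputStream(new FileOutputStream(tmp))
    out.putNextEntry(new JarEntry("testpackage/mymain.class"))
    out.closeEntry()
    out.close()

    println(containsClass(tmp.getPath, "testpackage.mymain"))  // true
    println(containsClass(tmp.getPath, "testpackage.missing")) // false
  }
}
```

If containsClass returns false for the class passed to --class, the problem is the artifact layout, not the spark-submit command line.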