spark-core 1.6.1 & lift-json 2.6.3 java.lang.NoClassDefFoundError
I have a Spark application with the sbt build file shown below.
It works on my local machine, but when I submit it to EMR running Spark 1.6.1, I get the following error:
java.lang.NoClassDefFoundError: net/liftweb/json/JsonAST$JValue
I am building the jar with `sbt package`.
build.sbt:
organization := "com.foo"
name := "FooReport"
version := "1.0"
scalaVersion := "2.10.6"
libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-core" % "1.6.1"
,"net.liftweb" % "lift-json_2.10" % "2.6.3"
,"joda-time" % "joda-time" % "2.9.4"
)
Do you know what is going on?
I found the solution, and it works!
The problem is that `sbt package` does not include all the dependency jars in the output jar. To overcome this I tried sbt-assembly, but when I ran it I got a lot of "deduplicate" errors.
Eventually I came across this blog post, which made everything clear:
http://queirozf.com/entries/creating-scala-fat-jars-for-spark-on-sbt-with-sbt-assembly-plugin
In order to submit Spark jobs to a Spark Cluster (via spark-submit),
you need to include all dependencies (other than Spark itself) in the
Jar, otherwise you won't be able to use those in your job.
- Create "assembly.sbt" under the /project folder.
- Add this line:
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")
- Then paste the assemblyMergeStrategy code below into your build.sbt:
assemblyMergeStrategy in assembly := {
case PathList("javax", "servlet", xs @ _*) => MergeStrategy.last
case PathList("javax", "activation", xs @ _*) => MergeStrategy.last
case PathList("org", "apache", xs @ _*) => MergeStrategy.last
case PathList("com", "google", xs @ _*) => MergeStrategy.last
case PathList("com", "esotericsoftware", xs @ _*) => MergeStrategy.last
case PathList("com", "codahale", xs @ _*) => MergeStrategy.last
case PathList("com", "yammer", xs @ _*) => MergeStrategy.last
case "about.html" => MergeStrategy.rename
case "META-INF/ECLIPSEF.RSA" => MergeStrategy.last
case "META-INF/mailcap" => MergeStrategy.last
case "META-INF/mimetypes.default" => MergeStrategy.last
case "plugin.properties" => MergeStrategy.last
case "log4j.properties" => MergeStrategy.last
case x =>
val oldStrategy = (assemblyMergeStrategy in assembly).value
oldStrategy(x)
}
Then run `sbt assembly`.
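As a sketch of the build step (the jar name below is the sbt-assembly default derived from `name` and `version` in build.sbt, so adjust it to your project):

```shell
# Build the fat jar; sbt-assembly writes it under target/scala-<scala version>/
sbt assembly

# By default the output is named <name>-assembly-<version>.jar
ls target/scala-2.10/FooReport-assembly-1.0.jar
```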
Now you have a fat jar that contains all of your dependencies. Depending on the libraries involved, it can be hundreds of MB. In my case I am running on AWS EMR, which already has Spark 1.6.1 installed. To exclude the spark-core lib from your jar, you can use the "provided" keyword:
"org.apache.spark" %% "spark-core" % "1.6.1" % "provided"
Here is the final build.sbt file:
organization := "com.foo"
name := "FooReport"
version := "1.0"
scalaVersion := "2.10.6"
libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-core" % "1.6.1" % "provided"
,"net.liftweb" % "lift-json_2.10" % "2.6.3"
,"joda-time" % "joda-time" % "2.9.4"
)
assemblyMergeStrategy in assembly := {
case PathList("javax", "servlet", xs @ _*) => MergeStrategy.last
case PathList("javax", "activation", xs @ _*) => MergeStrategy.last
case PathList("org", "apache", xs @ _*) => MergeStrategy.last
case PathList("com", "google", xs @ _*) => MergeStrategy.last
case PathList("com", "esotericsoftware", xs @ _*) => MergeStrategy.last
case PathList("com", "codahale", xs @ _*) => MergeStrategy.last
case PathList("com", "yammer", xs @ _*) => MergeStrategy.last
case "about.html" => MergeStrategy.rename
case "META-INF/ECLIPSEF.RSA" => MergeStrategy.last
case "META-INF/mailcap" => MergeStrategy.last
case "META-INF/mimetypes.default" => MergeStrategy.last
case "plugin.properties" => MergeStrategy.last
case "log4j.properties" => MergeStrategy.last
case x =>
val oldStrategy = (assemblyMergeStrategy in assembly).value
oldStrategy(x)
}