java.lang.NoClassDefFoundError: com/databricks/spark/avro/package$

I am using Spark 1.3.0 and spark-avro 1.0.0. My build.sbt file looks like this:

libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-core_2.10" % "1.3.0" % "provided",
  "org.apache.spark" % "spark-sql_2.10" % "1.5.2" % "provided",
  "com.databricks" % "spark-avro_2.10" % "1.0.0",
  "org.apache.avro" % "avro" % "1.7.7",
  "org.apache.avro" % "avro-mapred" % "1.7.7",
  "org.apache.spark" % "spark-hive_2.10" % "1.0.0" % "provided",
  "joda-time" % "joda-time" % "2.9.2",
  "org.joda" % "joda-convert" % "1.8.1",
  "commons-codec" % "commons-codec" % "1.9"
)

I am building a fat jar with the assembly plugin.

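But sometimes my code fails with the error below. If I run jar -tf Fooassembly.jar, I can see plenty of .class files under the 'com/databricks/spark/avro' folder, so I am not sure why it complains about this particular class.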

Exception in thread "main" java.lang.NoClassDefFoundError: com/databricks/spark/avro/package$
        at com.databricks.spark.avro.DefaultSource.createRelation(DefaultSource.scala:78)
        at org.apache.spark.sql.sources.ResolvedDataSource$.apply(ddl.scala:308)
        at org.apache.spark.sql.DataFrame.save(DataFrame.scala:1123)
        at org.apache.spark.sql.DataFrame.save(DataFrame.scala:1083)
        at com.abhi.FormNameMatcher$$anonfun$main.apply(FormNameMatcher.scala:89)
        at com.abhi.FormNameMatcher$$anonfun$main.apply(FormNameMatcher.scala:83)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
        at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
        at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
        at com.abhi.FormNameMatcher$.main(FormNameMatcher.scala:83)
        at com.abhi.FormNameMatcher.main(FormNameMatcher.scala)

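spark-avro has to be compatible with Spark itself. First fix the mismatch between spark-core and spark-sql (they should be the same version; in your build spark-core is 1.3.0 while spark-sql is 1.5.2), then pick a spark-avro version that is compatible with that Spark version (see the Requirements section at https://github.com/databricks/spark-avro).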
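
As a rough sketch of that advice, the dependency list from the question could be rewritten so that every Spark module shares one version. This assumes the cluster really runs Spark 1.3.0, as the question states. The `sparkVersion` value is a name introduced here for illustration, and keeping spark-avro at 1.0.0 for Spark 1.3 is an assumption that should be checked against the Requirements section of the spark-avro README. Note that spark-hive (1.0.0 in the original build) is aligned here as well.

```scala
// build.sbt (sketch): keep every Spark module on a single version.
// Assumption: the cluster runs Spark 1.3.0, as stated in the question.
val sparkVersion = "1.3.0"

libraryDependencies ++= Seq(
  // All Spark artifacts use the same version and stay "provided",
  // so the cluster's own Spark jars are used at runtime.
  "org.apache.spark" % "spark-core_2.10" % sparkVersion % "provided",
  "org.apache.spark" % "spark-sql_2.10"  % sparkVersion % "provided",
  "org.apache.spark" % "spark-hive_2.10" % sparkVersion % "provided",
  // Assumption: 1.0.0 is the spark-avro release that pairs with Spark 1.3;
  // confirm this in the Requirements section of the spark-avro README.
  "com.databricks"   % "spark-avro_2.10" % "1.0.0",
  // Remaining dependencies are unchanged from the question.
  "org.apache.avro"  % "avro"            % "1.7.7",
  "org.apache.avro"  % "avro-mapred"     % "1.7.7",
  "joda-time"        % "joda-time"       % "2.9.2",
  "org.joda"         % "joda-convert"    % "1.8.1",
  "commons-codec"    % "commons-codec"   % "1.9"
)
```

Pulling the version into one value makes it harder for the modules to drift apart again when the cluster is upgraded.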