ERROR yarn.ApplicationMaster: User class threw exception: java.lang.reflect.InvocationTargetException java.lang.reflect.InvocationTargetException

I'm getting an error while running a Spark job on a YARN cluster. I have built jars several times before and they ran successfully, but this time I can't even run a simple WordCount program. This is the error I get.

16/04/06 20:38:13 INFO cluster.YarnClusterSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
16/04/06 20:38:13 INFO cluster.YarnClusterScheduler: YarnClusterScheduler.postStartHook done
16/04/06 20:38:13 ERROR yarn.ApplicationMaster: User class threw exception: null
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:68)
    at org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:60)
    at org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$setConf(TorrentBroadcast.scala:73)
    at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:79)
    at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
    at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:29)
    at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:62)
    at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1051)
    at org.apache.spark.SparkContext.hadoopFile(SparkContext.scala:761)
    at org.apache.spark.SparkContext.textFile(SparkContext.scala:589)
    at com.demo.WordCountSimple$.main(WordCountSimple.scala:24)
    at com.demo.WordCountSimple.main(WordCountSimple.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon.run(ApplicationMaster.scala:480)
Caused by: java.lang.IllegalArgumentException
    at org.apache.spark.io.SnappyCompressionCodec.<init>(CompressionCodec.scala:152)
    ... 21 more
16/04/06 20:38:13 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: null)
16/04/06 20:38:13 INFO yarn.ApplicationMaster: Invoking sc stop from shutdown hook
16/04/06 20:38:13 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
16/04/06 20:38:13 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}

I'm using Spark 1.6.0 with Scala 2.11.7, and my sbt build file is as follows:

import sbt.Keys._

lazy val root = (project in file(".")).
  settings(
    name := "SparkTutorials",
    version := "1.0",
    scalaVersion := "2.11.7",
    mainClass in Compile := Some("com.demo.WordCountSimple")
  )

exportJars := true
fork := true

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.6.0" % "provided",
  "org.apache.spark" %% "spark-streaming" % "1.6.0",
  "org.apache.spark" %% "spark-mllib" % "1.6.0",
  "org.apache.spark" %% "spark-sql" % "1.6.0"
)

assemblyJarName := "WordCountSimple.jar"
//
val meta = """META.INF(.)*""".r

assemblyMergeStrategy in assembly := {
  case PathList("javax", "servlet", xs@_*) => MergeStrategy.first
  case PathList(ps@_*) if ps.last endsWith ".html" => MergeStrategy.first
  case n if n.startsWith("reference.conf") => MergeStrategy.concat
  case n if n.endsWith(".conf") => MergeStrategy.concat
  case meta(_) => MergeStrategy.discard
  case x => MergeStrategy.first
}

I submit the jar like this:

./bin/spark-submit --class  com.demo.WordCountSimple  --master yarn-cluster  --num-executors 8 --executor-memory 4g --executor-cores 10 /users/hastimal/WordCountSimple.jar   /test/sample.txt /test/output 

I'm actually doing some other work with Spark GraphX, but since it showed the same error, I wanted to test with a plain WordCount first. Still the same error. I followed the link and also the stack post, but no luck. Is there a problem with the jar? Or with the cluster? Or with the dependencies? Please help!

FYI, here is the code:

package com.demo

import java.util.Calendar

import org.apache.spark.{SparkContext, SparkConf}

/**
 * Created by hastimal on 3/14/2016.
 */
object WordCountSimple {
  def main(args: Array[String]) {
    //System.setProperty("hadoop.home.dir","F:\winutils")
    if (args.length < 2) {
      System.err.println("Usage: WordCountSimple <inputPath> <outputPath>")
      System.exit(1)
    }
    val inputPath = args(0)   // input path from the first argument
    val outputPath = args(1)  // output path from the second argument
    // Create a Scala Spark Context.
    val conf = new SparkConf().setAppName("WordCountSimple")
    val sc = new SparkContext(conf)
    val startTime = Calendar.getInstance().getTime()
    println("startTime "+startTime)
//    val input = sc.textFile(inputPath,8)
    val input = sc.textFile(inputPath,4)
    // Split each line into words.
    val words = input.flatMap(line => line.split(" "))
    val counts = words.map(word => (word, 1)).reduceByKey{case (x, y) => x + y}
    counts.saveAsTextFile(outputPath)
    //counts.foreach(println(_))
    val endTime = Calendar.getInstance().getTime()
    println("endTime "+endTime)
    val totalTime = endTime.getTime-startTime.getTime
    println("totalTime "+totalTime)
  }
}

The problem was with the Snappy IO (the `SnappyCompressionCodec` constructor failing in the stack trace above), so I submitted the job with the codec switched to LZ4:

./bin/spark-submit --class com.demo.WordCountSimple --master yarn-cluster --num-executors 8 --executor-memory 4g --executor-cores 10 --conf spark.io.compression.codec=lz4 /users/hastimal/WordCountSimple.jar /test/sample.txt /test/output

It succeeded! Thanks @zsxwing
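If you prefer not to pass the flag on every submit, the same override can be set programmatically before creating the `SparkContext`. This is a minimal sketch of that alternative, using the same `spark.io.compression.codec` property as the `--conf` flag above:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch: force the LZ4 codec in code instead of via --conf,
// avoiding the SnappyCompressionCodec that failed in the stack trace.
val conf = new SparkConf()
  .setAppName("WordCountSimple")
  .set("spark.io.compression.codec", "lz4")
val sc = new SparkContext(conf)
```

A command-line `--conf` takes precedence over values hard-coded in `SparkConf` only in the other direction (code wins over `--conf`), so setting it here makes the workaround permanent for this application.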