ERROR Utils: Uncaught exception in thread SparkListenerBus

ERROR Utils: Uncaught exception in thread SparkListenerBus

I am trying to run a simple project with Apache Spark. Here is my code, SimpleApp.scala:

/* SimpleApp.scala */
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object SimpleApp {
  def main(args: Array[String]) {
    val logFile = "/home/hduser/spark-1.2.0-bin-hadoop2.4/README.md" // Should be some file on your system
    // val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext("local", "Simple Job", "/home/hduser/spark-1.2.0-bin-hadoop2.4/")
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("hadoop")).count()
    val numBs = logData.filter(line => line.contains("see")).count()
    println("Lines with hadoop: %s, Lines with see: %s".format(numAs, numBs))
  }
}

When I manually submit this job to Spark from the command line with /home/hduser/spark-1.2.0-hadoop-2.4.0/bin/spark-submit --class "SimpleApp" --master local[4] target/scala-2.10/simple-project_2.10-1.0.jar, it runs successfully.

If I run it with sbt run while the Apache Spark service is running, it also succeeds, but at the end of the log it prints an error like this:

15/02/06 15:56:49 ERROR Utils: Uncaught exception in thread SparkListenerBus
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:996)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303)
    at java.util.concurrent.Semaphore.acquire(Semaphore.java:317)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$$anonfun$run.apply$mcV$sp(LiveListenerBus.scala:48)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$$anonfun$run.apply(LiveListenerBus.scala:47)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$$anonfun$run.apply(LiveListenerBus.scala:47)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1460)
    at org.apache.spark.scheduler.LiveListenerBus$$anon.run(LiveListenerBus.scala:46)
15/02/06 15:56:49 ERROR ContextCleaner: Error in cleaning thread
java.lang.InterruptedException
    at java.lang.Object.wait(Native Method)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
    at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning.apply$mcV$sp(ContextCleaner.scala:136)
    at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning.apply(ContextCleaner.scala:134)
    at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning.apply(ContextCleaner.scala:134)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1460)
    at org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:133)
    at org.apache.spark.ContextCleaner$$anon.run(ContextCleaner.scala:65)

Is there anything wrong with my code? Thanks in advance. I am using Apache Spark 1.2.0-bin-hadoop-2.4 and Scala 2.10.4.

According to this mail archive, which says:

Hi Haoming,

You can safely disregard this error. This is printed at the end of the execution when we clean up and kill the daemon context cleaning thread. In the future it would be good to silence this particular message, as it may be confusing to users.

Andrew

the error can be safely ignored.

That said, whenever Spark code is run, the SparkContext (or SparkSession for Spark >= 2.0.0) should be stopped by adding sc.stop() (or spark.stop()) at the end of the code, so the daemon listener and cleaner threads are shut down cleanly instead of being interrupted when the JVM exits.
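
Applied to the code from the question, a minimal sketch of that fix (same paths and filter terms as above; only the final sc.stop() call is new) would look like this:

/* SimpleApp.scala -- same job as above, but stopping the context at the end */
import org.apache.spark.SparkContext

object SimpleApp {
  def main(args: Array[String]) {
    val logFile = "/home/hduser/spark-1.2.0-bin-hadoop2.4/README.md"
    val sc = new SparkContext("local", "Simple Job", "/home/hduser/spark-1.2.0-bin-hadoop2.4/")
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("hadoop")).count()
    val numBs = logData.filter(line => line.contains("see")).count()
    println("Lines with hadoop: %s, Lines with see: %s".format(numAs, numBs))
    // Shut the context down cleanly so the daemon SparkListenerBus and
    // ContextCleaner threads are not interrupted mid-wait when sbt run exits.
    sc.stop()
  }
}

On Spark 2.x you would build a SparkSession instead and call spark.stop() in the same place.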