Failed to register classes with Kryo

I am creating a Spark context like this:
(ns something
  (:require [flambo.conf :as conf]
            [flambo.api :as f]))
(def c (-> (conf/spark-conf)
           (conf/master "spark://formcept008.lan:7077") 
           (conf/app-name "clustering")))
(def sc (f/spark-context c))

Then I create an RDD:

(f/parallelize sc DATA)

Now, when I perform an action on this data, such as (f/take rdd 3), I get an error:

17/11/28 14:35:00 ERROR Utils: Exception encountered
org.apache.spark.SparkException: Failed to register classes with Kryo
    at org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:129)
    at org.apache.spark.serializer.KryoSerializerInstance.borrowKryo(KryoSerializer.scala:274)
    at org.apache.spark.serializer.KryoSerializerInstance.<init>(KryoSerializer.scala:259)
    at org.apache.spark.serializer.KryoSerializer.newInstance(KryoSerializer.scala:175)
    at org.apache.spark.rdd.ParallelCollectionPartition$$anonfun$readObject.apply$mcV$sp(ParallelCollectionRDD.scala:79)
    at org.apache.spark.rdd.ParallelCollectionPartition$$anonfun$readObject.apply(ParallelCollectionRDD.scala:70)
    at org.apache.spark.rdd.ParallelCollectionPartition$$anonfun$readObject.apply(ParallelCollectionRDD.scala:70)
    at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1273)
    at org.apache.spark.rdd.ParallelCollectionPartition.readObject(ParallelCollectionRDD.scala:70)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1909)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:253)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: flambo.kryo.BaseFlamboRegistrator
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.spark.serializer.KryoSerializer$$anonfun$newKryo.apply(KryoSerializer.scala:124)
    at org.apache.spark.serializer.KryoSerializer$$anonfun$newKryo.apply(KryoSerializer.scala:124)
    at scala.collection.TraversableLike$$anonfun$map.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map.apply(TraversableLike.scala:234)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
    at org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:124)
    ... 27 more
17/11/28 14:35:00 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.IllegalStateException: unread block data
    at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2449)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1385)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:253)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Any ideas, please?

It seems flambo is not on your classpath, which is why you are getting:

java.lang.ClassNotFoundException: flambo.kryo.BaseFlamboRegistrator

Are you running this from a REPL, or are you using a lein or boot task?

If you are using Leiningen, check your classpath (lein classpath) and dependency tree (lein deps :tree).

Also, run lein clean to make sure your target folder is not causing problems.
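If the dependency tree shows flambo missing entirely, it needs to be declared in project.clj. A minimal sketch (the coordinates and versions below are illustrative, not taken from the question):

```clojure
;; project.clj -- illustrative coordinates; pick versions matching your cluster
(defproject something "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.8.0"]
                 [yieldbot/flambo "0.8.2"]          ; Clojure DSL for Spark
                 ;; Spark artifact must match the cluster's Spark and Scala versions
                 [org.apache.spark/spark-core_2.11 "2.1.0"]])
```

Note that having flambo on the driver's classpath is not enough for a standalone cluster: the workers also need the jars, which is what the accepted fix below addresses.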

Stack trace analysis: Failed to register classes with Kryo occurs because flambo.kryo.BaseFlamboRegistrator cannot be found on the classpath.

Solved, as follows.

Add all of the project's jars to the Spark configuration (inside the -> thread that builds the conf):

(conf/jars (map #(.getPath %)
                (.getURLs (java.lang.ClassLoader/getSystemClassLoader))))

That registers all the classes. Closing this since the issue is resolved.
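For reference, the whole configuration with the fix folded in might look like this. This is a sketch based on the question's code; the cast to java.net.URLClassLoader assumes a Java 8 runtime, where the system class loader is a URLClassLoader (this no longer holds on Java 9+):

```clojure
(ns something
  (:require [flambo.conf :as conf]
            [flambo.api :as f]))

;; Paths of every jar on the system classpath, so Spark can ship them
;; to the executors (Java 8: the system class loader is a URLClassLoader).
(def classpath-jars
  (map #(.getPath %)
       (.getURLs ^java.net.URLClassLoader
                 (java.lang.ClassLoader/getSystemClassLoader))))

(def c (-> (conf/spark-conf)
           (conf/master "spark://formcept008.lan:7077")
           (conf/app-name "clustering")
           (conf/jars classpath-jars)))

(def sc (f/spark-context c))
```

With the jars shipped to the executors, flambo.kryo.BaseFlamboRegistrator is resolvable on the workers and the Kryo registration error goes away.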