Unable to get output for Spark case classes

I am trying to implement the following.

Using Spark 2.4.8 and sbt 1.4.3 with IntelliJ.

Code:

val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._

case class Person(id:Int,Name:String,cityId:Long)
case class City(id:Long,Name:String)

val family=Seq(Person(1,"john",11),(2,"MAR",12),(3,"Iweta",10)).toDF
val cities=Seq(City(11,"boston"),(12,"dallas")).toDF


error:
Exception in thread "main" java.lang.NoClassDefFoundError: no Java class corresponding to Product with Serializable found
    at scala.reflect.runtime.JavaMirrors$JavaMirror.typeToJavaClass(JavaMirrors.scala:1300)
    at scala.reflect.runtime.JavaMirrors$JavaMirror.runtimeClass(JavaMirrors.scala:192)
    at scala.reflect.runtime.JavaMirrors$JavaMirror.runtimeClass(JavaMirrors.scala:54)
    at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.apply(ExpressionEncoder.scala:60)
    at org.apache.spark.sql.Encoders$.product(Encoders.scala:275)
    at org.apache.spark.sql.LowPrioritySQLImplicits$class.newProductEncoder(SQLImplicits.scala:248)
    at org.apache.spark.sql.SQLImplicits.newProductEncoder(SQLImplicits.scala:34)
    at usingcaseclass$.main(usingcaseclass.scala:26)
    at usingcaseclass.main(usingcaseclass.scala)

case class Salary(depName: String, empNo: Long, salary: Long)
val empsalary = Seq(Salary("sales", 1, 5000), Salary("personnel", 2, 3900)).toDS
empsalary.show(false)

value toDS is not a member of Seq[Salary]
val empsalary = Seq(Salary("sales", 1, 5000), Salary("personnel", 2, 3900)).toDS

Any idea how to prevent this error?

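You are defining the Seq incorrectly. Only the first element is wrapped in Person(...); the others are plain tuples, so the result is a Seq[Product with Serializable] rather than the Seq[T] that toDF works on.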
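The inference described above can be seen without Spark at all. The sketch below is plain Scala; the object name InferenceDemo is made up for illustration, and the explicit type annotations only spell out the element types rather than being required. The mixed sequence is annotated with java.io.Serializable (a supertype of the Serializable named in the error message) as an assumption, so that the same lines also compile on newer Scala versions.

```scala
// Case class copied from the question.
case class Person(id: Int, Name: String, cityId: Long)

// InferenceDemo is a hypothetical name used only for this illustration.
object InferenceDemo {
  // A Person mixed with bare tuples: the only thing the elements share is
  // Product with Serializable, so the element type is no longer Person.
  val mixed: Seq[Product with java.io.Serializable] =
    Seq(Person(1, "john", 11), (2, "MAR", 12), (3, "Iweta", 10))

  // Wrapping every row in Person(...) keeps the element type as Person.
  val uniform: Seq[Person] =
    Seq(Person(1, "john", 11), Person(2, "MAR", 12), Person(3, "Iweta", 10))

  def main(args: Array[String]): Unit = {
    // Count how many elements of each sequence really are Person instances.
    println(mixed.count(_.isInstanceOf[Person]))
    println(uniform.count(_.isInstanceOf[Person]))
  }
}
```

Spark can build an encoder for Person, but not for the abstract type Product with Serializable, which is what the stack trace in the question is complaining about.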

The following modified lines should work for you.

val family=Seq(Person(1,"john",11),Person(2,"MAR",12),Person(3,"Iweta",10))

family.toDF().show()

+---+-----+------+
| id| Name|cityId|
+---+-----+------+
|  1| john|    11|
|  2|  MAR|    12|
|  3|Iweta|    10|
+---+-----+------+
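The cities line in the question has the same problem and needs the same fix. For the separate "value toDS is not a member of Seq[Salary]" error, two commonly reported causes are that the implicits import is not in scope at the call site, or that the case class is declared inside the method that calls toDS. Below is a minimal sketch of the whole program with both addressed. It assumes a local SparkSession (the question uses a SQLContext built from an existing sc), and the object name UsingCaseClass is a guess based on the stack trace.

```scala
import org.apache.spark.sql.SparkSession

// Case classes are declared at the top level, outside any method,
// so that Spark can derive encoders for them.
case class Person(id: Int, Name: String, cityId: Long)
case class City(id: Long, Name: String)
case class Salary(depName: String, empNo: Long, salary: Long)

object UsingCaseClass {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session for running inside the IDE.
    val spark = SparkSession.builder()
      .appName("usingcaseclass")
      .master("local[*]")
      .getOrCreate()

    // Brings toDF and toDS into scope; must come after `spark` is defined.
    import spark.implicits._

    // Every element is built with its case class constructor.
    val family = Seq(Person(1, "john", 11), Person(2, "MAR", 12), Person(3, "Iweta", 10)).toDF()
    val cities = Seq(City(11, "boston"), City(12, "dallas")).toDF()
    val empsalary = Seq(Salary("sales", 1, 5000), Salary("personnel", 2, 3900)).toDS()

    family.show()
    cities.show()
    empsalary.show(false)

    spark.stop()
  }
}
```

If you keep the SQLContext from the question, the same rules apply: import sqlContext.implicits._ after the val is defined, and keep the case classes outside main.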