"Unable to find encoder for type stored in a Dataset" error in spite of providing the proper implicits
I am testing some basic Spark code where I convert a DataFrame to a Dataset by reading data from a data source.
import org.apache.spark.sql.SparkSession

object RunnerTest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("SparkSessionExample")
      .master("local[4]")
      .config("spark.sql.warehouse.dir", "target/spark-warehouse")
      .getOrCreate
    case class Characters(name: String, id: Int)
    import spark.implicits._
    val path = "examples/src/main/resources/Characters.csv"
    val peopleDS = spark.read.csv(path).as[Characters]
  }
}
This is pretty simple code, yet I am getting a compilation error:
Error:(42, 43) Unable to find encoder for type Characters. An implicit
Encoder[Characters] is needed to store Characters instances in a
Dataset. Primitive types (Int, String, etc) and Product types (case
classes) are supported by importing spark.implicits._ Support for
serializing other types will be added in future releases.
val peopleDS = spark.read.csv(path).as[Characters]
I am using Spark 2.4 with Scala 2.12.8.
The problem here is actually that the case class is declared inside the main object's method. For some reason Spark does not like that: the implicit Encoder derivation provided by spark.implicits._ apparently cannot be materialized for a case class defined locally inside a method, so the implicit lookup fails. It was a silly mistake, but it took a while to figure out what was missing. Once I moved the case class out of the object, it compiled fine.
import org.apache.spark.sql.SparkSession

// Case class moved to the top level, so spark.implicits._ can derive Encoder[Characters]
case class Characters(name: String, id: Int)

object RunnerTest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("SparkSessionExample")
      .master("local[4]")
      .config("spark.sql.warehouse.dir", "target/spark-warehouse")
      .getOrCreate
    import spark.implicits._
    val path = "examples/src/main/resources/Characters.csv"
    val peopleDS = spark.read.csv(path).as[Characters]
  }
}
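One more thing worth noting beyond the compile error: spark.read.csv(path) with no options produces string columns named _c0, _c1, so the .as[Characters] cast can still fail at runtime with an "up cast" analysis error because the column names and types don't match the case class fields. Below is a minimal sketch of reading the file so the columns line up (assuming Characters.csv has a header row; "header" and "inferSchema" are standard Spark CSV reader options):

import org.apache.spark.sql.SparkSession

case class Characters(name: String, id: Int)

object RunnerTest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("SparkSessionExample")
      .master("local[4]")
      .getOrCreate
    import spark.implicits._

    val path = "examples/src/main/resources/Characters.csv"
    // "header" maps the CSV column names onto the case class fields;
    // "inferSchema" turns the id column into an Int instead of a String.
    val peopleDS = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv(path)
      .as[Characters]

    peopleDS.show()
    spark.stop()
  }
}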