Scala exception in thread "main" java.lang.NoSuchMethodError
I'm new to Scala programming and am using the IntelliJ IDE. Running my Scala sample code throws the exception below. I'm not sure whether I'm missing any dependencies.
Sample code:
package com.assessments.example

import org.apache.log4j.{Level, Logger}
import org.apache.spark.sql.SparkSession

object Example extends App {

  // Create a Spark session, using a local master so Spark runs on the local machine
  val spark = SparkSession.builder().master("local[*]").appName("ScoringModel").getOrCreate()

  // Importing spark implicits allows functions such as dataframe.as[T]
  import spark.implicits._

  // Set logger level to WARN
  Logger.getRootLogger.setLevel(Level.WARN)

  case class CustomerData(
    customerId: String,
    forename: String,
    surname: String
  )

  case class FullName(
    firstName: String,
    surname: String
  )

  case class CustomerModel(
    customerId: String,
    forename: String,
    surname: String,
    fullname: FullName
  )

  val customerData = spark.read
    .option("header", "true")
    .csv("src/main/resources/customer_data.csv")
    .as[CustomerData]

  val customerModel = customerData.map(customer =>
    CustomerModel(
      customerId = customer.customerId,
      forename = customer.forename,
      surname = customer.surname,
      fullname = FullName(
        firstName = customer.forename,
        surname = customer.surname)
    )
  )

  customerModel.show(truncate = false)
  customerModel.write.mode("overwrite").parquet("src/main/resources/customerModel.parquet")
}
Exception message:
Exception in thread "main" java.lang.NoSuchMethodError: scala.collection.mutable.Buffer$.empty()Lscala/collection/GenTraversable;
at org.apache.spark.sql.SparkSessionExtensions.<init>(SparkSessionExtensions.scala:103)
at org.apache.spark.sql.SparkSession$Builder.<init>(SparkSession.scala:793)
at org.apache.spark.sql.SparkSession$.builder(SparkSession.scala:984)
at com.assessments.example.Example$.delayedEndpoint$com$assessments$example$Example(Example.scala:10)
at com.assessments.example.Example$delayedInit$body.apply(Example.scala:6)
at scala.Function0.apply$mcV$sp(Function0.scala:39)
at scala.Function0.apply$mcV$sp$(Function0.scala:39)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
at scala.App.$anonfun$main(App.scala:76)
at scala.App.$anonfun$main$adapted(App.scala:76)
at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:563)
at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:561)
at scala.collection.AbstractIterable.foreach(Iterable.scala:926)
at scala.App.main(App.scala:76)
at scala.App.main$(App.scala:74)
at com.assessments.example.Example$.main(Example.scala:6)
at com.assessments.example.Example.main(Example.scala)
I'm using Spark version 3.1.2 and Scala version 2.12.10, and as far as I can tell this Scala version is supported by Spark.
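For reference, a minimal build.sbt for these versions would look roughly like this (the actual file isn't included here, so the dependency list is an assumption based on the versions above):

// build.sbt (sketch) -- Spark 3.1.2 is published for Scala 2.12
scalaVersion := "2.12.10"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql" % "3.1.2"
)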
Any guidance on how to resolve this would be appreciated. Thanks.
To work around your problem, I wouldn't use a case class as the schema. I would also use a Spark DataFrame. You define the schema as follows:
import org.apache.spark.sql.types.{StringType, StructField, StructType}

val dataSchema = StructType(Array(
  StructField("customerId", StringType, nullable = true),
  StructField("forename", StringType, nullable = true),
  StructField("surname", StringType, nullable = true)
))

// Load data (spark is the SparkSession created above)
val rawDf = spark.read.format("csv")
  .option("delimiter", ",")   // edit accordingly
  .option("escape", "\"")     // edit accordingly
  .option("header", "true")
  .option("mode", "PERMISSIVE")
  .schema(dataSchema)
  .load("src/main/resources") // will read all the CSV files in the directory

rawDf.show()
Once you can see your data, you can move on to the transformation. Create a Struct or Map data type as suggested here. That example is in PySpark, but the idea is the same: you work with Spark SQL functions.
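As a minimal sketch of what that could look like in Scala (assuming the rawDf and column names from the snippet above, and using firstName as the nested field name from the question):

import org.apache.spark.sql.functions.{col, struct}

// Build the nested fullname column with Spark SQL functions instead of a case class
val finalDf = rawDf.withColumn(
  "fullname",
  struct(
    col("forename").as("firstName"),
    col("surname").as("surname")
  )
)
finalDf.show(truncate = false)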
Once that's done, just write it out to Parquet.
finalDf.write.mode("overwrite").parquet("src/main/resources/customerModel")
Note the output path: there is no file name. Spark writes the data out inside the customerModel directory.
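If you want to sanity-check the result, you can read the directory back with the same SparkSession (a quick sketch, assuming the output path above):

// Read the Parquet directory back and inspect the schema and rows
val reloaded = spark.read.parquet("src/main/resources/customerModel")
reloaded.printSchema()
reloaded.show(truncate = false)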