How to construct a function that can be used for mapping a JavaRDD[org.apache.spark.sql.Row] in spark/scala?
import org.apache.spark.sql.Row

val drdd = Seq(("a", 1), ("b", 2), ("a", 3)).toDF("name", "value").toJavaRDD
drdd.map { (row: Row) => row.get(0) }   // does not compile, see error below
The anonymous function I am passing appears to be a Row => Any, while it is expecting an org.apache.spark.api.java.function.Function[org.apache.spark.sql.Row, ?]:
<console>:35: error: type mismatch;
found : org.apache.spark.sql.Row => Any
required: org.apache.spark.api.java.function.Function[org.apache.spark.sql.Row,?]
drdd.map{ (row: Row) => row.get(0) }
^
What is the difference between these function types, and how should I construct one? Thanks!
JavaRDD.map expects Spark's Java functional interface org.apache.spark.api.java.function.Function[T, R] rather than a Scala function literal, so you have to implement its single call method explicitly. Example:
drdd.map(new org.apache.spark.api.java.function.Function[Row, String]() {
  // call is the single abstract method of the Java Function interface
  override def call(row: Row): String = row.getString(0)
})
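Alternatively, if you do not actually need the Java API, you can convert the JavaRDD back to a Scala RDD with .rdd and map it with an ordinary Scala function. A minimal sketch, assuming the same drdd as above:

import org.apache.spark.rdd.RDD
import org.apache.spark.sql.Row

// Convert the JavaRDD[Row] back to a Scala RDD[Row] ...
val rows: RDD[Row] = drdd.rdd

// ... and map with a plain Scala lambda; no Java Function wrapper needed.
val names: RDD[String] = rows.map(row => row.getString(0))

On Scala 2.12 and later, SAM conversion should also let the original lambda compile directly against the Java interface; with the older Scala version implied by the error above, the explicit Function implementation (or the .rdd conversion) is the way to go.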