Why does from_json fail with "not found : value from_json"?
I'm using Spark 2.1.1 (Kafka 0.10+) to read a Kafka topic whose payload is a JSON string. I want to parse the string against a schema and then carry on with my business logic.
Everyone seems to suggest that I should use from_json
to parse the JSON string; however, it doesn't seem to work in my case. The error is
not found : value from_json
.select(from_json($"json", txnSchema) as "data")
When I try the same lines in the spark shell, they work just fine:
val df = stream
.select($"value" cast "string" as "json")
.select(from_json($"json", txnSchema) as "data")
.select("data.*")
Any idea what I'm doing wrong in my code that makes this work in the shell but fail in the IDE / at compile time?
Here is the code:
import org.apache.spark.sql._

object Kafka10Cons3 extends App {
  val spark = SparkSession
    .builder
    .appName(Util.getProperty("AppName"))
    .master(Util.getProperty("spark.master"))
    .getOrCreate

  import spark.implicits._

  val stream = spark
    .readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", Util.getProperty("kafka10.broker"))
    .option("subscribe", src_topic)
    .load

  val txnSchema = Util.getTxnStructure
  val df = stream
    .select($"value" cast "string" as "json")
    .select(from_json($"json", txnSchema) as "data")
    .select("data.*")
}
You're probably just missing the relevant import: import org.apache.spark.sql.functions._.
You have imported spark.implicits._
and org.apache.spark.sql._
, but neither of those brings the individual functions defined on the functions
object into scope.
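With that import added, the relevant part of the program would look roughly like this (a sketch based on the code in the question; Util, src_topic, and getTxnStructure are assumed to exist as in the original):

```scala
import org.apache.spark.sql._
// Brings from_json (and the other SQL functions) into scope.
import org.apache.spark.sql.functions._

object Kafka10Cons3 extends App {
  val spark = SparkSession.builder
    .appName(Util.getProperty("AppName"))
    .master(Util.getProperty("spark.master"))
    .getOrCreate

  // Needed for the $"..." column interpolator syntax.
  import spark.implicits._

  val stream = spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", Util.getProperty("kafka10.broker"))
    .option("subscribe", src_topic)
    .load

  val txnSchema = Util.getTxnStructure
  val df = stream
    .select($"value" cast "string" as "json")
    .select(from_json($"json", txnSchema) as "data")  // now resolves
    .select("data.*")
}
```

The spark shell works without this import because it pre-imports org.apache.spark.sql.functions._ for you.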
I was also importing com.wizzardo.tools.json
, which looks like it also has a from_json
function. That must have been the one the compiler chose (since it was imported first?), and it was apparently incompatible with my version of Spark.
Make sure you are not importing a from_json
function from some other JSON library, since that library may not be compatible with the version of Spark you are using.
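If you genuinely need both libraries in scope, Scala lets you rename one side of the clash at the import site, so there is no ambiguity about which from_json the compiler picks (a sketch; com.wizzardo.tools.json is the conflicting library mentioned in the comment above):

```scala
// Rename Spark's from_json on import so it cannot collide with a
// from_json pulled in from another JSON library.
import org.apache.spark.sql.functions.{from_json => sparkFromJson}

// ...then use the renamed function in the query:
// .select(sparkFromJson($"json", txnSchema) as "data")
```

Alternatively, exclude the other library's symbol with import com.wizzardo.tools.json.{from_json => _, _} to hide just that name while keeping the rest of the package.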