Spark cannot process recursive Avro data
I have the following avsc schema:
{
  "name": "address",
  "type": [
    "null",
    {
      "type": "record",
      "name": "Address",
      "namespace": "com.data",
      "fields": [
        {
          "name": "address",
          "type": ["null", "com.data.Address"],
          "default": null
        }
      ]
    }
  ],
  "default": null
}
When I load this data in pyspark:
from pyspark.sql import SparkSession

jsonFormatSchema = open("Address.avsc", "r").read()
spark = SparkSession.builder.appName('abc').getOrCreate()
df = spark.read.format("avro")\
    .option("avroSchema", jsonFormatSchema)\
    .load("xxx.avro")
I get this exception:
"Found recursive reference in Avro schema, which can not be processed by Spark"
I have tried many other configurations, but none of them worked.
To run the job I use spark-submit with:
--packages org.apache.spark:spark-avro_2.12:3.0.1
This is intended behavior; you can take a look at the "issue".
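As far as I understand it, the reason is that a Spark SQL schema cannot describe a type that contains itself, so the Avro-to-Spark conversion rejects any record that refers back to one of its own ancestors. A minimal sketch of that check in plain Python follows, so you can test an avsc file before handing it to Spark. The function name `find_recursive_references` and the helper `_full_name` are my own and hypothetical, not part of spark-avro, and the name resolution is simplified compared to the full Avro rules.

```python
import json

# The schema from the question, inlined so the example is self-contained.
ADDRESS_AVSC = """
{
  "name": "address",
  "type": ["null", {
    "type": "record", "name": "Address", "namespace": "com.data",
    "fields": [
      {"name": "address", "type": ["null", "com.data.Address"], "default": null}
    ]
  }],
  "default": null
}
"""


def _full_name(name, namespace):
    # Simplified Avro name resolution: a dotted name is already a full name.
    return name if "." in name or not namespace else f"{namespace}.{name}"


def find_recursive_references(schema, namespace=None, ancestors=()):
    """Return full names of records that are referenced from inside themselves."""
    found = []
    if isinstance(schema, str):
        # A bare string is a primitive or a reference to a named type.
        if _full_name(schema, namespace) in ancestors:
            found.append(_full_name(schema, namespace))
    elif isinstance(schema, list):
        # A union: check every branch.
        for branch in schema:
            found += find_recursive_references(branch, namespace, ancestors)
    elif isinstance(schema, dict):
        kind = schema.get("type")
        if kind == "record":
            name = schema["name"]
            ns = name.rsplit(".", 1)[0] if "." in name else schema.get("namespace", namespace)
            full = _full_name(name, ns)
            # Fields are checked with this record added to the ancestor chain.
            for field in schema.get("fields", []):
                found += find_recursive_references(field["type"], ns, ancestors + (full,))
        elif kind == "array":
            found += find_recursive_references(schema["items"], namespace, ancestors)
        elif kind == "map":
            found += find_recursive_references(schema["values"], namespace, ancestors)
        elif kind is not None:
            # A wrapper such as a field definition or {"type": "string"}.
            found += find_recursive_references(kind, namespace, ancestors)
    return found


print(find_recursive_references(json.loads(ADDRESS_AVSC)))
# ['com.data.Address']
```

An empty list means no record refers to one of its own ancestors. A non-empty list names the records you would have to restructure (for example by limiting the nesting to a fixed depth) before Spark can map the schema to a DataFrame.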