How to define an Avro schema for a complex JSON document?
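I have a JSON document that I would like to convert to Avro, and I need to specify a schema for that purpose. Here is the JSON document I want to define an Avro schema for: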
{
    "uid": 29153333,
    "somefield": "somevalue",
    "options": [
        {
            "item1_lvl2": "a",
            "item2_lvl2": [
                {
                    "item1_lvl3": "x1",
                    "item2_lvl3": "y1"
                },
                {
                    "item1_lvl3": "x2",
                    "item2_lvl3": "y2"
                }
            ]
        }
    ]
}
I can define the schema for the non-complex types, but not for the complex "options" field:
{
    "namespace" : "my.com.ns",
    "type" : "record",
    "fields" : [
        {"name": "uid", "type": "int"},
        {"name": "somefield", "type": "string"},
        {"name": "options", "type": .....}
    ]
}
Thanks for your help!
You need to use Avro complex types, specifically arrays and records, and then nest them together:
{
    "namespace" : "my.com.ns",
    "name": "myrecord",
    "type" : "record",
    "fields" : [
        {"name": "uid", "type": "int"},
        {"name": "somefield", "type": "string"},
        {"name": "options", "type": {
            "type": "array",
            "items": {
                "type": "record",
                "name": "lvl2_record",
                "fields": [
                    {"name": "item1_lvl2", "type": "string"},
                    {"name": "item2_lvl2", "type": {
                        "type": "array",
                        "items": {
                            "type": "record",
                            "name": "lvl3_record",
                            "fields": [
                                {"name": "item1_lvl3", "type": "string"},
                                {"name": "item2_lvl3", "type": "string"}
                            ]
                        }
                    }}
                ]
            }
        }}
    ]
}
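To see how the nesting lines up with the sample document, here is a minimal sketch in Python that walks the schema and the JSON side by side. It uses only the standard library and is not an Avro implementation: the `matches` helper is hypothetical, handles only the four types used here (int, string, array, record), and assumes every record field must be present with no extras.

```python
import json

# The nested schema from the answer, as a Python dict.
SCHEMA = {
    "namespace": "my.com.ns", "name": "myrecord", "type": "record",
    "fields": [
        {"name": "uid", "type": "int"},
        {"name": "somefield", "type": "string"},
        {"name": "options", "type": {
            "type": "array",
            "items": {
                "type": "record", "name": "lvl2_record",
                "fields": [
                    {"name": "item1_lvl2", "type": "string"},
                    {"name": "item2_lvl2", "type": {
                        "type": "array",
                        "items": {
                            "type": "record", "name": "lvl3_record",
                            "fields": [
                                {"name": "item1_lvl3", "type": "string"},
                                {"name": "item2_lvl3", "type": "string"},
                            ],
                        },
                    }},
                ],
            },
        }},
    ],
}

# The sample document from the question.
DOCUMENT = json.loads("""
{
  "uid": 29153333,
  "somefield": "somevalue",
  "options": [
    {"item1_lvl2": "a",
     "item2_lvl2": [
       {"item1_lvl3": "x1", "item2_lvl3": "y1"},
       {"item1_lvl3": "x2", "item2_lvl3": "y2"}
     ]}
  ]
}
""")

INT_MIN, INT_MAX = -2**31, 2**31 - 1  # Avro "int" is a 32-bit signed integer


def matches(schema, datum):
    """Hypothetical helper: True if datum fits schema (int, string, array, record only)."""
    if schema == "int":
        return (isinstance(datum, int) and not isinstance(datum, bool)
                and INT_MIN <= datum <= INT_MAX)
    if schema == "string":
        return isinstance(datum, str)
    if isinstance(schema, dict) and schema.get("type") == "array":
        return isinstance(datum, list) and all(matches(schema["items"], d) for d in datum)
    if isinstance(schema, dict) and schema.get("type") == "record":
        fields = schema["fields"]
        # Assumption: all fields required, no extra keys allowed.
        return (isinstance(datum, dict)
                and set(datum) == {f["name"] for f in fields}
                and all(matches(f["type"], datum[f["name"]]) for f in fields))
    raise ValueError(f"unsupported schema: {schema!r}")


print(matches(SCHEMA, DOCUMENT))  # True
```

For real conversion you would hand the schema to an Avro library rather than a hand-rolled check like this one.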
Also, to improve readability, you can split the schema into multiple files.
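Splitting works because Avro lets a named type, once defined, be referenced by its full name. A rough sketch of the idea in Python follows. The three dicts stand in for three schema files (the file names in the comments are made up), and the `inline` helper is a hypothetical stand-in for whatever reference resolution your Avro tooling provides, so check your library's documentation for the actual mechanism.

```python
import json

# Each dict stands in for the content of one schema file (file names are hypothetical).
LVL3 = {  # lvl3_record.avsc
    "namespace": "my.com.ns", "type": "record", "name": "lvl3_record",
    "fields": [
        {"name": "item1_lvl3", "type": "string"},
        {"name": "item2_lvl3", "type": "string"},
    ],
}
LVL2 = {  # lvl2_record.avsc
    "namespace": "my.com.ns", "type": "record", "name": "lvl2_record",
    "fields": [
        {"name": "item1_lvl2", "type": "string"},
        # Reference to the record defined in the other "file", by full name.
        {"name": "item2_lvl2",
         "type": {"type": "array", "items": "my.com.ns.lvl3_record"}},
    ],
}
TOP = {  # myrecord.avsc
    "namespace": "my.com.ns", "type": "record", "name": "myrecord",
    "fields": [
        {"name": "uid", "type": "int"},
        {"name": "somefield", "type": "string"},
        {"name": "options",
         "type": {"type": "array", "items": "my.com.ns.lvl2_record"}},
    ],
}


def inline(schema, named):
    """Hypothetical helper: replace full-name references with their definitions.

    Only safe when each named type is referenced once; inlining the same
    record twice would define it twice, which Avro does not allow.
    """
    if isinstance(schema, str):
        return inline(named[schema], named) if schema in named else schema
    if isinstance(schema, list):
        return [inline(s, named) for s in schema]
    if isinstance(schema, dict):
        return {k: (inline(v, named) if k in ("type", "items", "fields") else v)
                for k, v in schema.items()}
    return schema


named = {f'{s["namespace"]}.{s["name"]}': s for s in (LVL3, LVL2)}
combined = inline(TOP, named)
print(json.dumps(combined, indent=2))
```

The combined result has the same nesting as the single-file schema, while each record can be read and maintained on its own.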
This online tool (http://avro4s-ui.landoop.com/) is very handy: it generates an Avro schema from a given valid JSON document.
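To give a feel for what generating a schema from sample JSON involves, here is a toy sketch in Python. It is not how the tool above works internally (that is not described here). The `infer` function and the generated record names are hypothetical, and the sketch assumes every array is non-empty and homogeneous, so it only looks at the first element.

```python
import json

SAMPLE = json.loads("""
{
  "uid": 29153333,
  "somefield": "somevalue",
  "options": [
    {"item1_lvl2": "a",
     "item2_lvl2": [
       {"item1_lvl3": "x1", "item2_lvl3": "y1"},
       {"item1_lvl3": "x2", "item2_lvl3": "y2"}
     ]}
  ]
}
""")


def infer(value, name):
    """Hypothetical helper: guess an Avro type for one JSON value."""
    if isinstance(value, bool):  # must come before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        # Avro "int" is 32-bit; fall back to "long" for larger values.
        return "int" if -2**31 <= value < 2**31 else "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    if value is None:
        return "null"
    if isinstance(value, list):
        # Assumption: non-empty, homogeneous array; infer from the first element.
        return {"type": "array", "items": infer(value[0], name + "_item")}
    if isinstance(value, dict):
        return {
            "type": "record",
            "name": name,  # made-up record name derived from the field name
            "fields": [{"name": k, "type": infer(v, k)} for k, v in value.items()],
        }
    raise TypeError(f"unsupported JSON value: {value!r}")


schema = infer(SAMPLE, "myrecord")
print(json.dumps(schema, indent=2))
```

A generated schema like this is only a starting point; optional fields, unions, and sensible record names still need manual review.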