使用 Amazon Athena 创建 table 和查询 json 数据?
Create table and query json data using Amazon Athena?
我想使用 Amazon Athena 查询 JSON 格式的数据:
[{"id":"0581b7c92be",
"key":"0581b7c92be",
"value":{"rev":"1-ceeeecaa040"},
"doc":{"_id":"0581b7c92be497d19e5ab51e577ada12","_rev":"1ceeeecaa04","node":"belt","DeviceId":"C001"}},
{"id":"0581b7c92be49",
"key":"0581b7c92be497d19e5",
"value":{"rev":"1-ceeeecaa04031842d3ca"},
"doc":{"_id":"0581b7c92be497","_rev":"1ceeeecaa040318","node":"belt","DeviceId":"C001"}
}
]
Athena DDL 基于 Hive,因此您希望数组中的每个 json 对象都在单独的行中:
{"id": "0581b7c92be", "key": "0581b7c92be", "value": {"rev": "1-ceeeecaa040"}, "doc": {"_id": "0581b7c92be497d19e5ab51e577ada12", "_rev": "1ceeeecaa04", "node": "belt", "DeviceId": "C001"} }
{"id": "0581b7c92be49", "key": "0581b7c92be497d19e5", "value": {"rev": "1-ceeeecaa04031842d3ca"}, "doc": {"_id": "0581b7c92be497", "_rev": "1ceeeecaa040318", "node": "belt", "DeviceId": "C001"} }
您可能会遇到嵌套字段 ("value","doc") 的问题,因此,如果您可以将 json 扁平化,事情就会变得更容易。 (参见示例:Hive for complex nested Json)
我想使用 Amazon Athena 查询 JSON 格式的数据:
[{"id":"0581b7c92be",
"key":"0581b7c92be",
"value":{"rev":"1-ceeeecaa040"},
"doc":{"_id":"0581b7c92be497d19e5ab51e577ada12","_rev":"1ceeeecaa04","node":"belt","DeviceId":"C001"}},
{"id":"0581b7c92be49",
"key":"0581b7c92be497d19e5",
"value":{"rev":"1-ceeeecaa04031842d3ca"},
"doc":{"_id":"0581b7c92be497","_rev":"1ceeeecaa040318","node":"belt","DeviceId":"C001"}
}
]
Athena DDL 基于 Hive,因此您希望数组中的每个 json 对象都在单独的行中:
{"id": "0581b7c92be", "key": "0581b7c92be", "value": {"rev": "1-ceeeecaa040"}, "doc": {"_id": "0581b7c92be497d19e5ab51e577ada12", "_rev": "1ceeeecaa04", "node": "belt", "DeviceId": "C001"} }
{"id": "0581b7c92be49", "key": "0581b7c92be497d19e5", "value": {"rev": "1-ceeeecaa04031842d3ca"}, "doc": {"_id": "0581b7c92be497", "_rev": "1ceeeecaa040318", "node": "belt", "DeviceId": "C001"} }
您可能会遇到嵌套字段 ("value","doc") 的问题,因此,如果您可以将 json 扁平化,事情就会变得更容易。 (参见示例:Hive for complex nested Json)