Aggregate a JSON object's own key-value attributes in Athena using OpenX SerDe
I have a JSON structure that looks similar to the following two sample events:
Event 1
{
"event":{
"type" : "FooBarEvent",
"kv":{
"key1":"value1",
"key2":"value2",
"3":"three",
"d":"4"
}
}
}
Event 2
{
"event":{
"type" : "FooBarEvent",
"kv":{
"key1":"value1",
"key2":"value2000",
"e": "4"
}
}
}
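Note that I do not know in advance which keys and values will arrive, and I want to aggregate (count) them. For these two events, the output should look like this: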
+-----------+------+-----------+--------+
| EventType | Key | Value | Amount |
+-----------+------+-----------+--------+
| Foobar | key1 | value1 | 2 |
+-----------+------+-----------+--------+
| Foobar | key2 | value2 | 1 |
+-----------+------+-----------+--------+
| Foobar | key2 | value2000 | 1 |
+-----------+------+-----------+--------+
| Foobar | 3 | three | 1 |
+-----------+------+-----------+--------+
| Foobar | d | 4 | 1 |
+-----------+------+-----------+--------+
| Foobar | e | 4 | 1 |
+-----------+------+-----------+--------+
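Is there a way to do this in Athena without changing the JSON structure? What is the best way to map and flatten/query this structure?
Hi, this should be possible using the UNNEST function and casting kv to a map. Assuming your data is stored in a table named json_data, with the raw JSON string in a column named json_field, the following query should work: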
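-- Step 1: extract the event type and cast the kv object to a map
with data_formatted as
(
select *
,json_extract_scalar(json_field,'$.event.type') event_type
,cast(json_extract(json_field,'$.event.kv') as map(varchar,varchar)) key_value
from json_data
)
-- Step 2: expand each map entry into its own row (k = key, v = value)
,unnesting_data as
(
select *
from data_formatted
cross join unnest(key_value) as t (k,v)
)
-- Step 3: count each (event type, key, value) combination
select event_type,k,v,count(1) amount
from unnesting_data
group by 1,2,3
order by 1,2,3;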
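Since the title mentions the OpenX SerDe, an alternative may be to declare kv as a map in the table definition itself, so that no JSON functions are needed at query time. The following is only a sketch under assumptions: the table name events_typed and the S3 location are hypothetical placeholders, not taken from the question, and the SerDe expects each JSON record to sit on a single line in the source files.

```sql
-- Hypothetical table: the name and LOCATION are placeholders.
-- kv is declared as map<string,string>, so unknown keys are accepted as-is.
CREATE EXTERNAL TABLE events_typed (
  event struct<type:string,kv:map<string,string>>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://your-bucket/events/';

-- Expand the map into one row per key/value pair, then count each combination.
SELECT event.type AS event_type, t.k, t.v, count(*) AS amount
FROM events_typed
CROSS JOIN UNNEST(event.kv) AS t (k, v)
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3;
```

Athena runs one statement per query, so the DDL and the SELECT would be submitted separately. The aggregation step is the same as in the query above; only the place where kv becomes a map differs.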