Aggregations on a complex structure in Hive
I have created a table in Hive in which one column contains a complex structure (an array of structs).
Sample records:
+-----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
| id | order | |
+-----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
| 1 | [{"books":"prince","order_timestamp":"2022-01-19 00:45:22","check_out_timestamp":"2022-01-19 00:45:22"},{"books":"venceremos","order_timestamp":"2022-01-19 00:47:13","check_out_timestamp":null},{"books":"rich dad poor dad","order_timestamp":null,"check_out_timestamp":"2022-01-19 00:47:13"}] | |
| 2 | [{"books":"lord of flies","order_timestamp":"2022-01-11 12:47:14","check_out_timestamp":"2022-01-11 13:08:20"},{"books":"test","order_timestamp":"2022-01-11 12:47:14","check_out_timestamp":"2022-01-11 12:47:14"},{"books":"physics","order_timestamp":"2022-01-11 12:47:14","check_out_timestamp":"2022-01-11 12:47:14"}] | |
| 3 | [{"books":"test","order_timestamp":"2022-01-14 18:21:03","check_out_timestamp":"2022-01-14 18:21:03"},{"books":"up and down","order_timestamp":"2022-01-14 18:23:21","check_out_timestamp":"2022-01-14 18:23:21.018"},{"books":"mathematics","order_timestamp":"2022-01-14 18:23:21","check_out_timestamp":"2022-01-14 18:23:21"}] | |
+-----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
How can I run aggregations on this column?
For example, how many books were ordered on 2022-01-14, or which book is the best seller?
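Use lateral view inline to explode the array of structs into rows, then aggregate. For example, to count how many books were ordered per date (filtered here to 2022-01-14):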
-- order is a reserved word in Hive, so the column name is quoted with backticks
select date(e.order_timestamp) as order_date,
       count(*) as book_cnt
from table_name
lateral view inline(`order`) e as books, order_timestamp, check_out_timestamp
where date(e.order_timestamp) = date('2022-01-14')
group by date(e.order_timestamp)
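The question also asks for the best-selling book. The same lateral view inline pattern should cover that. The query below is a minimal sketch, not a tested answer. It assumes that "best seller" means the title that appears most often across all orders, and that a null order_timestamp means the book was never actually ordered. Both are assumptions, so adjust or drop the filter if your data means something else. The aliases book and order_cnt are illustrative names.

```sql
-- Sketch: find the title that appears most often across all orders.
-- Assumes `order` is array<struct<books, order_timestamp, check_out_timestamp>>.
select e.books  as book,
       count(*) as order_cnt
from table_name
lateral view inline(`order`) e as books, order_timestamp, check_out_timestamp
where e.order_timestamp is not null  -- assumption: null means "not ordered"
group by e.books
order by order_cnt desc
limit 1;
```

Note that limit 1 returns only one row when several titles tie for first place. To see the ties, remove the limit and inspect the top rows.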