Presto/AWS Athena - select latest version of a record
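I have an order events table that holds a few entries per order by the time the order is completed. Some orders are cancelled or refunded. I am trying to select the latest version of every order whose latest version has the status 'OrderConfirmed'. I assumed the following SQL would work, but AWS Athena complains that Column 'latest_order_update.latest_update' cannot be resolved. Any clues?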
WITH latest_order_update AS (
SELECT orderevent_order.unique_id, MAX(orderevent_order.event_time) AS latest_update
FROM orderevent_order
GROUP BY orderevent_order.unique_id)
SELECT orderevent_order.unique_id
FROM orderevent_order
WHERE orderevent_order.event_time = latest_order_update.latest_update AND orderevent_order.header_event_name = 'OrderConfirmed'
LIMIT 10;
You can rewrite it with ROW_NUMBER:
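WITH cte AS (
    SELECT oo.unique_id,
           oo.header_event_name,
           -- rn = 1 marks the most recent event of each order
           ROW_NUMBER() OVER(PARTITION BY oo.unique_id ORDER BY oo.event_time DESC) rn
    FROM orderevent_order oo
)
SELECT *
FROM cte
WHERE rn = 1
  -- keep only orders whose latest event is a confirmation
  AND header_event_name = 'OrderConfirmed';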
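To see how the ROW_NUMBER approach behaves, here is a minimal sketch that can be pasted into Athena without touching the real table. The four sample rows and the CTE name `ranked` are made up for illustration; the inline `orderevent_order` CTE only stands in for the actual table, and the status filter from the question is applied after ranking.

```sql
-- Hypothetical sample rows standing in for the real orderevent_order table
WITH orderevent_order AS (
    SELECT *
    FROM (VALUES
        ('A', TIMESTAMP '2021-01-01 10:00:00', 'OrderCreated'),
        ('A', TIMESTAMP '2021-01-01 10:05:00', 'OrderConfirmed'),
        ('B', TIMESTAMP '2021-01-01 10:00:00', 'OrderConfirmed'),
        ('B', TIMESTAMP '2021-01-01 10:10:00', 'OrderCancelled')
    ) AS t (unique_id, event_time, header_event_name)
),
ranked AS (
    SELECT unique_id,
           header_event_name,
           -- number each order's events from newest (1) to oldest
           ROW_NUMBER() OVER (PARTITION BY unique_id ORDER BY event_time DESC) AS rn
    FROM orderevent_order
)
SELECT unique_id
FROM ranked
WHERE rn = 1                                  -- latest event per order
  AND header_event_name = 'OrderConfirmed';   -- and that event is a confirmation
-- Only order 'A' is returned: the latest event of 'B' is 'OrderCancelled'
```

The filter on `header_event_name` has to sit outside the ranking CTE. If it were applied before ranking, order 'B' would also be returned, because its cancelled event would be dropped before the latest row is picked.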
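Or reference the CTE in a FROM/JOIN clause or a subquery. Defining a CTE does not bring its columns into scope; the original query never mentions latest_order_update in a FROM clause, which is why the column cannot be resolved. With a correlated subquery: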
WITH latest_order_update AS (
SELECT orderevent_order.unique_id,
MAX(orderevent_order.event_time) AS latest_update
FROM orderevent_order
GROUP BY orderevent_order.unique_id)
SELECT orderevent_order.unique_id
FROM orderevent_order
WHERE orderevent_order.event_time IN (SELECT l.latest_update
                                      FROM latest_order_update l
                                      WHERE orderevent_order.unique_id = l.unique_id)
  AND orderevent_order.header_event_name = 'OrderConfirmed'
LIMIT 10;
Or with a JOIN:
WITH latest_order_update AS (
SELECT orderevent_order.unique_id,
MAX(orderevent_order.event_time) AS latest_update
FROM orderevent_order
GROUP BY orderevent_order.unique_id)
SELECT orderevent_order.unique_id
FROM orderevent_order
JOIN latest_order_update
ON orderevent_order.event_time = latest_order_update.latest_update
AND orderevent_order.unique_id = latest_order_update.unique_id
WHERE orderevent_order.header_event_name = 'OrderConfirmed'
LIMIT 10;
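As a further option that is not part of the answer above, Presto's `max_by(x, y)` aggregate returns the value of `x` from the row with the largest `y`, so the latest status per order can be checked in one pass without a CTE. This is only a sketch, assuming the same column names as in the question:

```sql
SELECT unique_id
FROM orderevent_order
GROUP BY unique_id
-- status of the most recent event for each order
HAVING max_by(header_event_name, event_time) = 'OrderConfirmed'
LIMIT 10;
```

If two events of one order share the same `event_time`, which status `max_by` picks is not something to rely on, so an order with such a tie may or may not be returned; the JOIN version above has the same sensitivity to ties.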