presto/aws athena - select 最新版本记录

presto/aws athena - select latest version of a record

我有一个订单事件 table,每个订单在完成时包含很少的条目。有些订单被取消或退款。我正在尝试 select 所有订单的最新版本,最新版本的状态为 'OrderConfirmed' 我假设我会使用以下 SQL 但 AWS Athena 抱怨 Column 'latest_order_update.latest_update' 无法解决。有什么线索吗?

WITH latest_order_update AS (
  SELECT orderevent_order.unique_id, MAX(orderevent_order.event_time) AS latest_update
  FROM orderevent_order
  GROUP BY orderevent_order.unique_id)
SELECT orderevent_order.unique_id
FROM orderevent_order
WHERE orderevent_order.event_time = latest_order_update.latest_update AND orderevent_order.header_event_name = 'OrderConfirmed'
LIMIT 10;

你可以用ROW_NUMBER重写它:

WITH cte AS (
 SELECT oo.unique_id,
    ,ROW_NUMBER() OVER(PARTITION BY unique_id ORDER BY event_time DESC) rn
  FROM orderevent_order oo
)
SELECT * 
FROM cte
WHERE rn = 1;

或参考FROM/JOIN/子查询中的cte:

WITH latest_order_update AS (
  SELECT orderevent_order.unique_id, 
     MAX(orderevent_order.event_time) AS latest_update
  FROM orderevent_order
  GROUP BY orderevent_order.unique_id)
SELECT orderevent_order.unique_id
FROM orderevent_order
WHERE orderevent_order.event_time IN (SELECT l.latest_update 
                                      FROM latest_order_update l
                                      WHERE orderevent_order.unique_id 
                                         = l.unique_id)           
  AND orderevent_order.header_event_name = 'OrderConfirmed'
LIMIT 10;

加入:

WITH latest_order_update AS (
  SELECT orderevent_order.unique_id, 
     MAX(orderevent_order.event_time) AS latest_update
  FROM orderevent_order
  GROUP BY orderevent_order.unique_id)
SELECT orderevent_order.unique_id
FROM orderevent_order
JOIN latest_order_update
  ON orderevent_order.event_time = latest_order_update.latest_update
 AND orderevent_order.unique_id = latest_order_update.unique_id
WHERE orderevent_order.header_event_name = 'OrderConfirmed'
LIMIT 10;