Determine the closest date to another date value in Teradata

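My dataset looks like the one below. For each combination of customer ID, order ID and ship date, I want to retrieve one process date that is less than or equal to the ship date. If the process date is greater than the ship date and no lower process date exists, then use the ship date as the process date.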

+-------------+----------+------------+--------------+
| Customer ID | Order ID | Ship Date  | Process Date |
+-------------+----------+------------+--------------+
|        1000 |      100 | 9/17/2020  | 9/17/2020    |
|        1000 |      100 | 9/17/2020  | 10/16/2020   |
|        1000 |      100 | 9/17/2020  | 9/16/2020    |
|        2000 |      200 | 8/15/2020  | 8/13/2020    |
|        2000 |      300 | 10/14/2020 | 10/13/2020   |
|        3000 |      400 | 3/4/2020   | 4/2/2020     |
|        3000 |      400 | 3/4/2020   | 3/3/2020     |
|        3000 |      400 | 3/4/2020   | 3/5/2020     |
|        4000 |      500 | 5/1/2020   | 5/3/2020     |
|        5000 |      600 | 6/1/2020   | 7/1/2020     |
|        5000 |      600 | 6/1/2020   | 7/2/2020     |
|        6000 |      700 | 7/14/2020  | 7/13/2020    |
|        6000 |      700 | 7/14/2020  | 6/10/2020    |
+-------------+----------+------------+--------------+

Desired output:

+-------------+----------+------------+--------------+
| Customer ID | Order ID | Ship Date  | Process Date |
+-------------+----------+------------+--------------+
|        1000 |      100 | 9/17/2020  | 9/17/2020    |
|        2000 |      200 | 8/15/2020  | 8/13/2020    |
|        2000 |      300 | 10/14/2020 | 10/13/2020   |
|        3000 |      400 | 3/4/2020   | 3/3/2020     |
|        4000 |      500 | 5/1/2020   | 5/1/2020     |
|        5000 |      600 | 6/1/2020   | 6/1/2020     |
|        6000 |      700 | 7/14/2020  | 7/13/2020    |
+-------------+----------+------------+--------------+

I tried using ROWNUM and date differences, but I got stuck after getting the row numbers in ascending order. I'm not sure how to move forward.

I think you want to filter on row_number():

select t.*
from (select t.*,
             row_number() over (partition by customer_id, order_id, ship_date order by process_date desc) as seqnum
      from t
      where process_date <= ship_date
     ) t
where seqnum = 1;

I'm not sure whether customer_id and ship_date are really needed in the partition by clause; order_id seems sufficient.

"If the process date is greater than the ship date and no lower process date exists, then use the ship date as the process date."

Do a GROUP BY. You can use MAX() to return the latest ProcessDate <= ShipDate. If no such ProcessDate exists, return ShipDate.

select CustomerID, orderID, ShipDate,
       coalesce(MAX(case when ProcessDate <= ShipDate then ProcessDate end), ShipDate)
from tablename
group by CustomerID, orderID, ShipDate
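
One way to sanity-check the GROUP BY approach against the sample data is to run it in SQLite from Python. This is only a sketch: SQLite stands in for Teradata, the helper name closest_process_dates is hypothetical, and the dates are stored as ISO strings (an assumption made here so that plain text comparison orders them correctly).

```python
import sqlite3

# Sample rows from the question: (CustomerID, orderID, ShipDate, ProcessDate).
# Assumption: dates are ISO strings so that text comparison matches date order in SQLite.
ROWS = [
    (1000, 100, "2020-09-17", "2020-09-17"),
    (1000, 100, "2020-09-17", "2020-10-16"),
    (1000, 100, "2020-09-17", "2020-09-16"),
    (2000, 200, "2020-08-15", "2020-08-13"),
    (2000, 300, "2020-10-14", "2020-10-13"),
    (3000, 400, "2020-03-04", "2020-04-02"),
    (3000, 400, "2020-03-04", "2020-03-03"),
    (3000, 400, "2020-03-04", "2020-03-05"),
    (4000, 500, "2020-05-01", "2020-05-03"),
    (5000, 600, "2020-06-01", "2020-07-01"),
    (5000, 600, "2020-06-01", "2020-07-02"),
    (6000, 700, "2020-07-14", "2020-07-13"),
    (6000, 700, "2020-07-14", "2020-06-10"),
]

# Same logic as the GROUP BY query; the alias and the final ORDER BY
# are added only to make the output easy to read and compare.
QUERY = """
select CustomerID, orderID, ShipDate,
       coalesce(max(case when ProcessDate <= ShipDate then ProcessDate end), ShipDate) as ProcessDate
from tablename
group by CustomerID, orderID, ShipDate
order by CustomerID, orderID
"""


def closest_process_dates(rows):
    """Load the rows into an in-memory SQLite table and run the GROUP BY query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table tablename "
        "(CustomerID integer, orderID integer, ShipDate text, ProcessDate text)"
    )
    conn.executemany("insert into tablename values (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in closest_process_dates(ROWS):
        print(row)
```

For orders 500 and 600 no ProcessDate qualifies, so MAX() yields NULL and COALESCE falls back to ShipDate, which matches the desired output.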

This should return the expected result:

select CustomerID, orderID, ShipDate, 
   -- If the process date is greater than the ship date and no lower 
   -- process date exist, then use the ship date as the process date
   least(ProcessDate, ShipDate)
from tablename
qualify
   -- retrieve 1 process date that is less than or equal to the ship date
   row_number() 
   over (partition by CustomerID, orderID
         order by case when ProcessDate <= ShipDate then ProcessDate end desc nulls last) = 1
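
The ordering trick in the QUALIFY clause can be easier to follow outside SQL. The Python sketch below is an illustration only, not Teradata code, and the function name pick_process_date is hypothetical. It mirrors the query for a single group: process dates on or before the ship date sort first (latest first), later dates behave like NULL and sort last, and the final min plays the role of least.

```python
from datetime import date


def pick_process_date(ship_date, process_dates):
    """Return the latest process date <= ship_date, or ship_date if none exists.

    Assumes process_dates is non-empty, as every group in the query has at least one row.
    """
    def sort_key(process_date):
        # Mirrors: order by case when ProcessDate <= ShipDate then ProcessDate end desc nulls last
        if process_date <= ship_date:
            return (0, -process_date.toordinal())  # qualifying dates first, latest first
        return (1, 0)  # non-qualifying dates act like NULL and sort last

    first = min(process_dates, key=sort_key)  # the row where row_number() = 1
    return min(first, ship_date)  # mirrors least(ProcessDate, ShipDate)


if __name__ == "__main__":
    # Order 400 from the question: only 3/3/2020 is on or before the ship date 3/4/2020.
    print(pick_process_date(date(2020, 3, 4),
                            [date(2020, 4, 2), date(2020, 3, 3), date(2020, 3, 5)]))
```

When no process date qualifies, as for order 500, the first row has a process date after the ship date, so the min (least) step returns the ship date.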