Amazon Redshift 避免使用递归 CTE 来填充空值

Amazon Redshift avoid recursive CTE for filling null values with previous

我在 SQL 中找到了一个常见问题的很好的解决方案,就在这里:

我唯一的问题是 Amazon Redshift 不支持递归 CTE,有什么方法可以以不同的方式重写这部分代码并避免对 CleanCust 的递归?

/* Test Data & Table */
DECLARE @Customers TABLE
    (Dates datetime,
     Customer integer,
     Value integer) 

    INSERT  INTO @Customers
    VALUES  ('20100101', 1, 12),
        ('20100101', 2, NULL),
        ('20100101', 3, 32),
        ('20100101', 4, 42),
        ('20100101', 5, 15),
        ('20100102', 1, NULL),
        ('20100102', 2, NULL),
        ('20100102', 3, 39),
        ('20100102', 4, NULL),
        ('20100102', 5, 16),
        ('20100103', 1, 13),
        ('20100103', 2, 24),
        ('20100103', 3, NULL),
        ('20100103', 4, NULL),
        ('20100103', 5, 21),
        ('20100104', 1, 14),
        ('20100104', 2, NULL),
        ('20100104', 3, NULL),
        ('20100104', 4, 65),
        ('20100104', 5, 23) ;

/* CustCTE - This gives us a RowNum to allow us to build the recursive CTE CleanCust */
WITH    CustCTE
          AS (SELECT    Customer,
                        Value,
                        Dates,
                        ROW_NUMBER() OVER (PARTITION BY Customer ORDER BY Dates) RowNum
              FROM      @Customers),

/* CleanCust - A recursive CTE. This runs down the list of values for each customer, checking the Value column, if it is null it gets the previous non NULL value.*/
        CleanCust
          AS (SELECT    Customer,
                        ISNULL(Value, 0) Value, /* Ensure we start with no NULL values for each customer */
                        Dates,
                        RowNum
              FROM      CustCte cur
              WHERE     RowNum = 1
              UNION ALL
              SELECT    Curr.Customer,
                        ISNULL(Curr.Value, prev.Value) Value,
                        Curr.Dates,
                        Curr.RowNum
              FROM      CustCte curr
              INNER JOIN CleanCust prev ON curr.Customer = prev.Customer
                                           AND curr.RowNum = prev.RowNum + 1)

所需的输出如下,在“必需”列中:

Date    Customer   Value   Required   Rule
20100101       1      12         12
20100101       2                  0   If no value assign 0
20100101       3      32         32
20100101       4      42         42
20100101       5      15         15
20100102       1                 12   Take last known value
20100102       2                  0   Take last known value
20100102       3      39         39
20100102       4                 42   Take last known value
20100102       5      16         16
20100103       1      13         13
20100103       2      24         24
20100103       3                 39   Take last known value
20100103       4                 42   Take last known value
20100103       5      21         21
20100104       1      14         14
20100104       2                 24   Take last known value
20100104       3                 39   Take last known value
20100104       4      65         65
20100104       5      23         23

使用 运行 总和根据 null 值的出现设置组。然后获取该组的 max 值。

select dates,customer,val,coalesce(max(val) over(partition by customer,grp),0) as required
from (select dates,customer,val,
      sum(case when val is null then 0 else 1 end) 
      over(partition by customer order by dates rows unbounded preceding) as grp
      from customers
     ) t