Creating a calculated column based on its previous value in SQL

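I have a tough problem that I have been working on for several days. Our data warehouse is Redshift. This would be easy in something like Python, but building it in SQL is driving me crazy.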

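Sample data with the week number, total replenishment (additional inventory coming in), and estimated units sold (the ideal forecast of units sold when inventory is sufficient):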

SELECT 'W1' AS weeknum, 0 AS replenish, 20 AS est_units_sold
UNION ALL (SELECT 'W2' AS weeknum, 0 AS replenish, 20 AS est_units_sold)
UNION ALL (SELECT 'W3' AS weeknum, 0 AS replenish, 20 AS est_units_sold)
UNION ALL (SELECT 'W4' AS weeknum, 50 AS replenish, 20 AS est_units_sold)
UNION ALL (SELECT 'W5' AS weeknum, 0 AS replenish, 20 AS est_units_sold)
UNION ALL (SELECT 'W6' AS weeknum, 0 AS replenish, 30 AS est_units_sold)
UNION ALL (SELECT 'W7' AS weeknum, 0 AS replenish, 30 AS est_units_sold)
UNION ALL (SELECT 'W8' AS weeknum, 30 AS replenish, 20 AS est_units_sold)
UNION ALL (SELECT 'W9' AS weeknum, 0 AS replenish, 20 AS est_units_sold);

The data looks like this:

weeknum  replenish  est_units_sold
W1       0          20
W2       0          20
W3       0          20
W4       50         20
W5       0          20
W6       0          30
W7       0          30
W8       30         20
W9       0          20

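What I need to create is a column holding each week's beginning inventory, given a beginning inventory for W1 (essentially today's stock), say 30 units.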

Pseudocode:

Week(n) inventory = Week(n-1) inventory - MIN(Week(n-1) inventory, Week(n-1) est_units_sold) + Week(n) replenish

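The MIN(Week(n-1) inventory, Week(n-1) est_units_sold) part accounts for the units actually sold given the stock on hand: for example, if we only have 10 units in stock and the ideal forecast is 20 units, we will only sell 10. Where I am stuck is that, when creating the inventory column, the formula has to reference itself in the previous row. I can't get around this blocker.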
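For reference, the recurrence in the pseudocode can be sketched as a plain loop. This is only a minimal illustration in Python, not a Redshift solution; the function name `beginning_inventory` is made up for this example, and the lists are the sample data from above with a W1 starting inventory of 30.

```python
def beginning_inventory(replenish, est_units_sold, start_inventory):
    """Return the beginning-of-week inventory for every week."""
    inventory = [start_inventory]  # W1 is given, not computed
    for n in range(1, len(replenish)):
        prev = inventory[n - 1]
        # Units actually sold last week: cannot exceed the stock on hand
        sold = min(prev, est_units_sold[n - 1])
        inventory.append(prev - sold + replenish[n])
    return inventory


# Sample data from the question (W1..W9)
replenish = [0, 0, 0, 50, 0, 0, 0, 30, 0]
est_units_sold = [20, 20, 20, 20, 20, 30, 30, 20, 20]

result = beginning_inventory(replenish, est_units_sold, start_inventory=30)
for week, inv in enumerate(result, start=1):
    print(f"W{week}: {inv}")
# result == [30, 10, 0, 50, 30, 10, 0, 30, 10]
```

Each value depends on the previous computed value, which is exactly what makes a single-pass SQL expression awkward.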

Desired result:

If weeknum is simplified to an integer (the problem can also be solved with string values such as 'W?'), you can do this with a recursive CTE:

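WITH RECURSIVE cte AS (
  -- Anchor row: week 1 starts with the known inventory of 30
  SELECT *, 30 AS inv FROM data WHERE weeknum = 1
  UNION ALL
  -- Recursive step: take the previous week's inventory, subtract what could
  -- actually be sold (capped at the stock on hand), add this week's replenishment
  SELECT d.*,
         c.inv - LEAST(c.inv, c.est_units_sold) + d.replenish
  FROM data d INNER JOIN cte c
  ON c.weeknum = d.weeknum - 1
)
SELECT * FROM cte;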

See the demo.
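If you want to try the recursive CTE without a Redshift cluster, one option is a small local check. The sketch below uses Python's built-in `sqlite3` module with an in-memory database as a stand-in, which is an assumption of this example and not part of the answer. SQLite has no `LEAST`, so its two-argument scalar `MIN` is swapped in; an `ORDER BY` is also added so the output order is deterministic.

```python
import sqlite3

# In-memory SQLite database used purely as a local stand-in for Redshift
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data (weeknum INTEGER, replenish INTEGER, est_units_sold INTEGER)"
)
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [(1, 0, 20), (2, 0, 20), (3, 0, 20), (4, 50, 20), (5, 0, 20),
     (6, 0, 30), (7, 0, 30), (8, 30, 20), (9, 0, 20)],
)

query = """
WITH RECURSIVE cte AS (
  SELECT *, 30 AS inv FROM data WHERE weeknum = 1
  UNION ALL
  SELECT d.*,
         c.inv - MIN(c.inv, c.est_units_sold) + d.replenish  -- scalar MIN stands in for LEAST
  FROM data d INNER JOIN cte c
  ON c.weeknum = d.weeknum - 1
)
SELECT weeknum, inv FROM cte ORDER BY weeknum
"""

rows = conn.execute(query).fetchall()
for weeknum, inv in rows:
    print(f"W{weeknum}: {inv}")
```

Dialects differ in details, so treat a passing local run as a check of the logic only, not of Redshift syntax.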

For your sample data:

WITH RECURSIVE cte AS (
  SELECT *, 30 AS inv FROM data WHERE weeknum = 'W1'
  UNION ALL
  SELECT d.*,
         c.inv - LEAST(c.inv, c.est_units_sold) + d.replenish
  FROM data d INNER JOIN cte c
  -- Strip the leading 'W' and compare the numeric parts of consecutive weeks
  ON SUBSTRING(c.weeknum, 2)::int = SUBSTRING(d.weeknum, 2)::int - 1
)
SELECT * FROM cte;

See the demo.

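To avoid recursion, which can be slow on large datasets, you need to unroll the logic. Here is a solution that performs the calculation with window functions.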

Setup:

create table test as (
  SELECT 'W1' AS weeknum, 0 AS replenish, 20 AS est_units_sold, 30 as inventory
UNION ALL (SELECT 'W2' AS weeknum, 0 AS replenish, 20 AS est_units_sold, null as inventory)
UNION ALL (SELECT 'W3' AS weeknum, 0 AS replenish, 20 AS est_units_sold, null as inventory)
UNION ALL (SELECT 'W4' AS weeknum, 50 AS replenish, 20 AS est_units_sold, null as inventory)
UNION ALL (SELECT 'W5' AS weeknum, 0 AS replenish, 20 AS est_units_sold, null as inventory)
UNION ALL (SELECT 'W6' AS weeknum, 0 AS replenish, 30 AS est_units_sold, null as inventory)
UNION ALL (SELECT 'W7' AS weeknum, 0 AS replenish, 30 AS est_units_sold, null as inventory)
UNION ALL (SELECT 'W8' AS weeknum, 30 AS replenish, 20 AS est_units_sold, null as inventory)
UNION ALL (SELECT 'W9' AS weeknum, 0 AS replenish, 20 AS est_units_sold, null as inventory)
);

The query looks like this (the intermediate calculations are left in the result so you can see the logic):

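select *,
    -- starting stock + replenishment so far - demand so far,
    -- plus the demand that was lost to stock-outs (running minimum of overage)
    30 + tot_replen - tot_sold -
        min(overage) over (order by weeknum rows unbounded preceding) as inventory
from (
    select weeknum, replenish, est_units_sold,
        -- demand in all earlier weeks (0 for the first week)
        coalesce(sum(est_units_sold) over (order by weeknum rows between unbounded preceding and 1 preceding), 0) as tot_sold,
        -- replenishment up to and including the current week
        sum(replenish) over (order by weeknum rows unbounded preceding) as tot_replen,
        -- unconstrained balance before this week's replenishment, capped at 0:
        -- a negative value is demand that could not have been met
        least(coalesce(inventory,
                 sum(coalesce(inventory, 0)) over (order by weeknum rows between unbounded preceding and 1 preceding) -
                 sum(est_units_sold) over (order by weeknum rows between unbounded preceding and 1 preceding) +
                 sum(replenish) over (order by weeknum rows between unbounded preceding and 1 preceding)
                ), 0) as overage
    from test
) as sub;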
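
To see why the unrolled form matches the row-by-row recurrence, the same steps can be mirrored outside the database. The sketch below is an illustrative Python translation of the window query, not part of the answer; `unrolled_inventory` and `prev_replen` are names invented for this example, and the variables `tot_sold`, `tot_replen`, and `overage` correspond to the query's intermediate columns.

```python
from itertools import accumulate


def unrolled_inventory(replenish, est_units_sold, start_inventory):
    """Beginning inventory per week from running sums and a running minimum."""
    # tot_sold: demand in all earlier weeks (excludes the current week)
    tot_sold = [0] + list(accumulate(est_units_sold))[:-1]
    # tot_replen: replenishment up to and including the current week
    tot_replen = list(accumulate(replenish))
    # Replenishment in earlier weeks only
    prev_replen = [0] + tot_replen[:-1]
    # overage: unconstrained balance before this week's replenishment, capped at 0
    overage = [
        min(start_inventory - sold + repl, 0)
        for sold, repl in zip(tot_sold, prev_replen)
    ]
    # Running minimum of overage: minus the demand lost to stock-outs so far
    lost = list(accumulate(overage, min))
    return [
        start_inventory + repl - sold - shortfall
        for repl, sold, shortfall in zip(tot_replen, tot_sold, lost)
    ]


# Sample data from the question (W1..W9)
replenish = [0, 0, 0, 50, 0, 0, 0, 30, 0]
est_units_sold = [20, 20, 20, 20, 20, 30, 30, 20, 20]

unrolled = unrolled_inventory(replenish, est_units_sold, start_inventory=30)
print(unrolled)
# [30, 10, 0, 50, 30, 10, 0, 30, 10]
```

The running minimum is what replaces the self-reference: it tracks how much forecast demand has gone unmet so far, and adding that back corrects the unconstrained running balance. Note that in this form the first week's replenishment is added to the starting inventory; the sample data has 0 there, so it makes no difference here.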