Split value from a total row to multiple other rows until the sum reaches the value of the total row in REDSHIFT

Question

CREATE TABLE inbound (
    id SERIAL PRIMARY KEY,
    campaign VARCHAR,
    expected_inbound_date DATE,
    expected_inbound_quantity DECIMAL,
    received_inbound_quantity DECIMAL
);

INSERT INTO inbound
(campaign, expected_inbound_date, expected_inbound_quantity, received_inbound_quantity)
VALUES 
('C001', '2022-05-03', '500', '0'),
('C001', '2022-05-03', '800', '0'),
('C001', '2022-05-03', '400', '0'),
('C001', '2022-05-03', '200', '0'),
('C001', NULL, '0', '700'),

('C002', '2022-08-20', '3000', '0'),
('C002', '2022-08-20', '5000', '0'),
('C002', '2022-08-20', '2800', '0'),
('C002', NULL, '0', '4000');

Expected result

campaign |  expected_inbound_date |  expected_inbound_quantity  |  split_received_inbound_quantity
---------|------------------------|-----------------------------|----------------------------------
  C001   |        2022-05-03      |             200             |          200
  C001   |        2022-05-03      |             400             |          400
  C001   |        2022-05-03      |             500             |          100
  C001   |        2022-05-03      |             800             |            0
  C001   |                        |                             |          700
---------|------------------------|-----------------------------|----------------------------------
  C002   |       2022-08-20       |           3.800             |         3.800
  C002   |       2022-08-20       |           5.000             |           200
  C002   |       2022-08-20       |           2.800             |             0
  C002   |                        |                             |         4.000

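I want to split the received_inbound_quantity across the rows of expected_inbound_quantity, row by row, until the total of received_inbound_quantity is reached.
With reference to an existing answer, I tried to go with this solution: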

SELECT
i.campaign AS campaign,
i.expected_inbound_date AS expected_inbound_date,
i.expected_inbound_quantity AS expected_inbound_quantity,
i.received_inbound_quantity AS received_inbound_quantity,

(SELECT 
   GREATEST(
     LEAST(i.expected_inbound_quantity, 
          (SELECT 
           SUM(i3.received_inbound_quantity) 
           FROM inbound i3 
           WHERE i.campaign = i3.campaign)  -
           
            (
                SELECT 
                t1.cumulated_value AS cumulated_value 
                FROM
                
                   (SELECT
                    i2.campaign, 
                    i2.expected_inbound_date, 
                    i2.expected_inbound_quantity, 
                    i2.received_inbound_quantity,
                    SUM(i2.expected_inbound_quantity) OVER (PARTITION BY i2.campaign ORDER BY i2.expected_inbound_date, i2.expected_inbound_quantity, i2.received_inbound_quantity ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS cumulated_value
                    FROM inbound i2
                    GROUP BY 1,2,3,4) t1
                    
                WHERE (t1.campaign, t1.expected_inbound_date, t1.expected_inbound_quantity, t1.received_inbound_quantity) = (i.campaign, i.expected_inbound_date, i.expected_inbound_quantity, i.received_inbound_quantity)
            )
            
        ),
        0
   )
) AS split

FROM inbound i
GROUP BY 1,2,3,4
ORDER BY 1,2,3,4

However, in Redshift I get the error:

Invalid operation: This type of correlated subquery pattern is not supported yet;

How do I need to modify the query so that it also works in Redshift?

Answer 1

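Window functions are your friend. When a query compares rows with each other, window functions are the first thing to look at on Redshift. They are simpler, cleaner, and faster than any self-join pattern.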

select 
  campaign,
  expected_inbound_date,
  expected_inbound_quantity,
  received_inbound_quantity,
  case when (inbound_total - inbound_sum) >= 0 then expected_inbound_quantity
       else case when (expected_inbound_quantity + inbound_total - inbound_sum) >= 0 then expected_inbound_quantity + inbound_total - inbound_sum
                else 0 end
    end as split

from (SELECT
  campaign,
  expected_inbound_date,
  expected_inbound_quantity,
  received_inbound_quantity,
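  -- running total of expected quantities; Redshift requires an explicit frame clause when an aggregate window function has ORDER BY
  sum(expected_inbound_quantity) over (partition by campaign order by expected_inbound_date, expected_inbound_quantity rows between unbounded preceding and current row) as inbound_sum,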
  max(received_inbound_quantity) over (partition by campaign) as inbound_total

  FROM inbound i
) subq
ORDER BY 1,2,3,4;

Updated fiddle here: https://dbfiddle.uk/?rdbms=postgres_13&fiddle=2381abdf5a90a997a4f05b809c892c40
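
The nested CASE in the query is effectively a clamp: each row receives whatever is left of the campaign total after the earlier rows have been filled, limited to the range between 0 and its own expected_inbound_quantity. As a sketch only (it is not the query from the fiddle), the same logic can be written with GREATEST and LEAST, which the question's query already uses. The column aliases inbound_sum and inbound_total are carried over from the query above, and the explicit ROWS frame is included on the assumption that the query will run on Redshift.

```sql
-- Sketch: same split as the nested CASE, written as a clamp.
SELECT
  campaign,
  expected_inbound_date,
  expected_inbound_quantity,
  received_inbound_quantity,
  -- (inbound_sum - expected_inbound_quantity) is what earlier rows already absorbed;
  -- the remainder is clamped to the range [0, expected_inbound_quantity]
  GREATEST(
    LEAST(expected_inbound_quantity,
          inbound_total - (inbound_sum - expected_inbound_quantity)),
    0
  ) AS split
FROM (
  SELECT
    campaign,
    expected_inbound_date,
    expected_inbound_quantity,
    received_inbound_quantity,
    -- running total of expected quantities within each campaign
    SUM(expected_inbound_quantity) OVER (
      PARTITION BY campaign
      ORDER BY expected_inbound_date, expected_inbound_quantity
      ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS inbound_sum,
    -- the received total sits on the single "total" row of each campaign
    MAX(received_inbound_quantity) OVER (PARTITION BY campaign) AS inbound_total
  FROM inbound
) subq
ORDER BY 1, 2, 3, 4;
```

For the C001 rows inserted in the question (total 700), the rows with expected quantities 200, 400, 500 and 800 get 200, 400, 100 and 0. The total row itself gets 0, exactly as with the CASE version.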

When porting this to Redshift you may want to convert the CASE statements to DECODE() functions, as IMHO these are more readable.
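
A rough idea of what that DECODE variant could look like is sketched below. DECODE matches on equality, so the ">= 0" tests are expressed through SIGN(), which is an assumption of this sketch rather than something the fiddle covers. It has not been run on a cluster and will not run on the PostgreSQL fiddle, since DECODE in this form is Redshift-specific.

```sql
-- Redshift-only sketch: the nested CASE rewritten with DECODE and SIGN.
SELECT
  campaign,
  expected_inbound_date,
  expected_inbound_quantity,
  received_inbound_quantity,
  DECODE(SIGN(inbound_total - inbound_sum),
         -- total already exhausted before this row is fully covered
         -1, DECODE(SIGN(expected_inbound_quantity + inbound_total - inbound_sum),
                    -1, 0,                                                   -- nothing left for this row
                    expected_inbound_quantity + inbound_total - inbound_sum), -- partial fill
         -- SIGN is 0 or 1: the row is fully covered
         expected_inbound_quantity) AS split
FROM (
  SELECT
    campaign,
    expected_inbound_date,
    expected_inbound_quantity,
    received_inbound_quantity,
    SUM(expected_inbound_quantity) OVER (
      PARTITION BY campaign
      ORDER BY expected_inbound_date, expected_inbound_quantity
      ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS inbound_sum,
    MAX(received_inbound_quantity) OVER (PARTITION BY campaign) AS inbound_total
  FROM inbound
) subq
ORDER BY 1, 2, 3, 4;
```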

PS. Thanks for setting up the fiddle, as it greatly speeds up providing an answer.
