How to use a window function to aggregate data based on a date or rank column?
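I have a list of shipments with the order total and the total for each individual shipment, but I'm struggling to come up with the code for an additional cumulative shipments column, which should include the current shipment plus all previous shipments for that order. Here are my current results: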
OrderNo | ShipDate | OrderTotal | Shipment Total | Cumulative Shipments | Rank |
---|---|---|---|---|---|
22396 | 2022-04-04 | 639,964 | 2,983 | 639,966 | 3 |
22396 | 2022-03-31 | 639,964 | 5,626 | 639,966 | 2 |
22396 | 2022-02-24 | 639,964 | 631,355 | 639,966 | 1 |
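These are 3 different shipments for the same order. The first shipment, in row 3, is correct, but I need the Cumulative Shipments column in row 2 to be the sum of both shipments, so $631,355 + $5,626. By the same logic, row 1 should be the sum of all 3 rows, which at that point equals the order total of $639,964. This is what it should look like: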
OrderNo | ShipDate | OrderTotal | Shipment Total | Cumulative Shipments | Rank |
---|---|---|---|---|---|
22396 | 2022-04-04 | 639,964 | 2,983 | 639,964 | 3 |
22396 | 2022-03-31 | 639,964 | 5,626 | 636,981 | 2 |
22396 | 2022-02-24 | 639,964 | 631,355 | 631,355 | 1 |
I assume the best way to accomplish this is with over(partition by ()), but I'm struggling to come up with the code. This is what I have so far:
SELECT
OrderNo,
ShipDate,
OrderTotal,
[Shipment Total],
SUM([Shipment Total]) OVER(PARTITION BY OrderNo) AS [Cumulative Shipments],
[Rank]
FROM Shipments
WHERE OrderNo = '22396'
The [Rank] column comes from an earlier CTE, which ranks each shipment within its order by ship date:
ROW_NUMBER() OVER(PARTITION BY d.OrderNo ORDER BY d.ShipDate) AS [Rank]
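I need something like SUM([Shipment Total]) over the rows whose rank is less than or equal to the current rank. I'm sure the date column could accomplish the same thing, but I'm not sure how to finish the query.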
You seem to be halfway there; you are just missing the ordering criteria that a working cumulative sum needs, for example:
select *,
Sum(ShipmentTotal)
over(partition by OrderNo
order by ShipDate rows between unbounded preceding and current row)
from Shipments;
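To check the running total against the sample rows from the question, here is a minimal sketch that runs the same kind of query through Python's built-in sqlite3 module. SQLite stands in for the asker's database purely as a convenient test bed (window functions need SQLite 3.25 or later). The column names `ShipmentTotal`, `ShipRank`, `CumByDate` and `CumByRank` are illustrative assumptions, not names from the original schema. It also orders a second window by the rank column, which should give the same totals as ordering by date.

```python
import sqlite3

# In-memory database holding the three sample shipments from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Shipments ("
    "OrderNo TEXT, ShipDate TEXT, OrderTotal INTEGER, ShipmentTotal INTEGER)"
)
conn.executemany(
    "INSERT INTO Shipments VALUES (?, ?, ?, ?)",
    [
        ("22396", "2022-04-04", 639964, 2983),
        ("22396", "2022-03-31", 639964, 5626),
        ("22396", "2022-02-24", 639964, 631355),
    ],
)

# The CTE rebuilds the rank column; the outer query computes the running
# total twice, once ordered by date and once ordered by rank.
running_total_sql = """
WITH Ranked AS (
    SELECT OrderNo, ShipDate, ShipmentTotal,
           ROW_NUMBER() OVER (PARTITION BY OrderNo ORDER BY ShipDate) AS ShipRank
    FROM Shipments
)
SELECT ShipDate, ShipmentTotal, ShipRank,
       SUM(ShipmentTotal) OVER (
           PARTITION BY OrderNo ORDER BY ShipDate
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS CumByDate,
       SUM(ShipmentTotal) OVER (
           PARTITION BY OrderNo ORDER BY ShipRank
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS CumByRank
FROM Ranked
ORDER BY ShipDate DESC
"""
rows = conn.execute(running_total_sql).fetchall()
for row in rows:
    print(row)
```

The explicit `ROWS` frame matters when two shipments share a date: with an `ORDER BY` but no frame clause, the default frame is `RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW`, which gives tied rows the same total.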
I didn't bother building the CTE, but assuming you have some PK, you can self-join:
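SELECT s.ShipmentsID, SUM(cumulative.ShipmentTotal) AS Sum
FROM Shipments s
LEFT JOIN Shipments cumulative
    ON cumulative.OrderNo = s.OrderNo       -- keep the running total within one order
    AND cumulative.ShipDate <= s.ShipDate   -- this shipment plus all earlier ones
GROUP BY s.ShipmentsID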
ShipmentsID | Sum |
---|---|
1 | 631355.52 |
2 | 636982.29 |
3 | 639966.04 |
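The self-join idea can be tried the same way. The sketch below, again using Python's sqlite3 module as an assumed test bed, adds a made-up second order (`99999`) and matches rows on `OrderNo` as well as on date, so that one order's shipments do not leak into another order's total. The whole-number amounts come from the question's table rather than the decimal values shown in this answer, and `CumulativeShipments` is an illustrative alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Shipments ("
    "ShipmentsID INTEGER PRIMARY KEY, OrderNo TEXT, "
    "ShipDate TEXT, ShipmentTotal INTEGER)"
)
conn.executemany(
    "INSERT INTO Shipments VALUES (?, ?, ?, ?)",
    [
        (1, "22396", "2022-02-24", 631355),
        (2, "22396", "2022-03-31", 5626),
        (3, "22396", "2022-04-04", 2983),
        (4, "99999", "2022-03-01", 100),  # hypothetical second order
    ],
)

# Each shipment is joined to every shipment of the same order that
# shipped on or before it; summing the matches gives the running total.
self_join_sql = """
SELECT s.ShipmentsID, SUM(c.ShipmentTotal) AS CumulativeShipments
FROM Shipments s
JOIN Shipments c
  ON c.OrderNo = s.OrderNo
 AND c.ShipDate <= s.ShipDate
GROUP BY s.ShipmentsID
ORDER BY s.ShipmentsID
"""
cumulative = dict(conn.execute(self_join_sql).fetchall())
print(cumulative)
```

Without the `OrderNo` condition, shipment 2 would also pick up the hypothetical order's amount, because that order shipped before 2022-03-31. A self-join compares every pair of rows within an order, so on large tables the window function is usually the cheaper choice.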