Rolling sum where category values are replaced/updated
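I am trying to calculate a rolling sum where the amount for a category/group can change on any given date. When a change occurs, the category's new value goes into the rolling sum and that category's previous value is ignored. So it is a rolling sum, but based only on the latest value (as of that point in time) for each category.
Sample data (SumAmount is the column I am trying to compute):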
txn_id | cust_id | trans_date | Category | amount | SumAmount
-----------------------------------------------------------------
1 | 1 | 2020-01-01| Ball | 5 | 5 --first tran so sum is 5
2 | 1 | 2020-01-02| Cup | 5 | 10 --sum is 10 (ball=5,Cup=5)
3 | 1 | 2020-01-03| Ball | 2 | 7 --sum is 7 (ball=2,Cup=5)
4 | 1 | 2020-02-04| Ball | 4 | 9 --sum is 9 (ball=4,Cup=5)
5 | 1 | 2020-02-05| Ball | 1 | 6 --sum is 6 (ball=1,Cup=5)
6 | 1 | 2020-02-06| Cup | 10| 11 --sum is 11(ball=1,Cup=10)
7 | 1 | 2020-02-07| Phone | 5 | 16 --sum is 16(ball=1,Cup=10,Phone=5)
8 | 1 | 2020-02-08| Cup | 5 | 11 --sum is 11(ball=1,Cup=5,Phone=5)
9 | 1 | 2020-02-09| Ball | 5 | 15 --sum is 15(ball=5,Cup=5,Phone=5)
I have this working with a cursor, but would like to know whether it can be done in a SET-based way.
The cursor is as follows:
CREATE PROCEDURE [dbo].[PriceHistory](@CustId int, @MaxPriceHistory decimal(16,2) OUTPUT)
AS
BEGIN
    -- Holds the latest amount seen so far for each category
    create table #PriceHistory ( CategoryID uniqueidentifier, Amount decimal(16,2))

    declare pricehistory_cursor CURSOR FOR
        select CategoryID, Amount
        from mytable
        where CustId = @CustId
        order by trans_date;

    declare @CategoryID uniqueidentifier
    declare @Amount decimal(16,2)
    declare @CurrentTotal decimal(16,2)
    set @MaxPriceHistory = 0

    open pricehistory_cursor
    fetch next from pricehistory_cursor into @CategoryID, @Amount
    WHILE @@FETCH_STATUS = 0
    BEGIN
        -- Replace the category's amount if it already exists, otherwise add it
        if (exists(select * from #PriceHistory where CategoryID = @CategoryID))
            update #PriceHistory set Amount = @Amount where CategoryID = @CategoryID
        else
            insert into #PriceHistory(CategoryID,Amount) values (@CategoryID, @Amount)

        -- Sum of the latest amounts across all categories at this point
        select @CurrentTotal = sum(Amount) from #PriceHistory
        if (@CurrentTotal > @MaxPriceHistory)
            set @MaxPriceHistory = @CurrentTotal

        fetch next from pricehistory_cursor into @CategoryID, @Amount
    END

    close pricehistory_cursor
    deallocate pricehistory_cursor;
END
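Ultimately, I am looking for the maximum SumAmount over the whole transaction history (the SumAmount column in the sample provided); in this example it is 16.
I understand what the cursor is doing and why it works: it replaces the amount for that particular category if one already exists, and sums it with the latest amounts of all other categories at that point. What confuses me about a SET-based approach is how I would get a Cup amount of 5 when txn_id = 5 happens. I can't figure out how to do this with some kind of recursive CTE or ROW_NUMBER, if it is possible at all.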
Since the data sits in a freshly created temporary table, the primary key has no gaps.
That makes this a good case for a recursive CTE.
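To try the queries in this answer, the sample rows from the question could be loaded into a temp table roughly as follows. This is only a sketch: the question does not give column types, so the types below are assumptions, and the IDENTITY column combined with INSERT ... SELECT ... ORDER BY is one way to get a gap-free txn_id in date order.

```sql
-- Assumed schema; column types are not given in the question.
DROP TABLE IF EXISTS #PriceHistory;  -- requires SQL Server 2016 or later

CREATE TABLE #PriceHistory
(
    txn_id     int IDENTITY(1,1) PRIMARY KEY,  -- gap-free key: 1, 2, 3, ...
    cust_id    int         NOT NULL,
    trans_date date        NOT NULL,
    category   varchar(30) NOT NULL,
    amount     int         NOT NULL
);

-- ORDER BY on INSERT ... SELECT determines the order in which
-- identity values are assigned, so txn_id follows trans_date.
INSERT INTO #PriceHistory (cust_id, trans_date, category, amount)
SELECT cust_id, trans_date, category, amount
FROM (VALUES
    (1, '2020-01-01', 'Ball',  5),
    (1, '2020-01-02', 'Cup',   5),
    (1, '2020-01-03', 'Ball',  2),
    (1, '2020-02-04', 'Ball',  4),
    (1, '2020-02-05', 'Ball',  1),
    (1, '2020-02-06', 'Cup',  10),
    (1, '2020-02-07', 'Phone', 5),
    (1, '2020-02-08', 'Cup',   5),
    (1, '2020-02-09', 'Ball',  5)
) AS v (cust_id, trans_date, category, amount)
ORDER BY trans_date;
```

With real data, the VALUES list would be replaced by a SELECT from the source table for one customer.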
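The query below carries forward the latest amounts for Ball, Cup and Phone.
Calculating the sum is then simply a matter of looking at the category: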
WITH RCTE_BALL_CUP_PHONE AS
(
    SELECT txn_id, cust_id, trans_date, category, amount
    , CASE WHEN category = 'Ball' THEN amount ELSE 0 END AS NearestBallAmount
    , CASE WHEN category = 'Cup' THEN amount ELSE 0 END AS NearestCupAmount
    , CASE WHEN category = 'Phone' THEN amount ELSE 0 END AS NearestPhoneAmount
    , amount AS SumAmount
    FROM #PriceHistory AS tmp
    WHERE txn_id = 1

    UNION ALL

    SELECT tmp.txn_id, tmp.cust_id, tmp.trans_date, tmp.category, tmp.amount
    , CASE WHEN tmp.category = 'Ball' THEN tmp.amount ELSE c.NearestBallAmount END
    , CASE WHEN tmp.category = 'Cup' THEN tmp.amount ELSE c.NearestCupAmount END
    , CASE WHEN tmp.category = 'Phone' THEN tmp.amount ELSE c.NearestPhoneAmount END
    , CASE
        WHEN tmp.category = 'Ball' THEN (tmp.amount + c.NearestCupAmount + c.NearestPhoneAmount)
        WHEN tmp.category = 'Cup' THEN (tmp.amount + c.NearestBallAmount + c.NearestPhoneAmount)
        WHEN tmp.category = 'Phone' THEN (tmp.amount + c.NearestCupAmount + c.NearestBallAmount)
        ELSE tmp.amount
      END
    FROM RCTE_BALL_CUP_PHONE c
    JOIN #PriceHistory AS tmp
      ON tmp.txn_id = c.txn_id + 1
)
SELECT txn_id, cust_id, trans_date, category, amount
, SumAmount
FROM RCTE_BALL_CUP_PHONE
ORDER BY txn_id;
txn_id | cust_id | trans_date | category | amount | SumAmount
-----: | ------: | :--------- | :------- | -----: | --------:
1 | 1 | 2020-01-01 | Ball | 5 | 5
2 | 1 | 2020-01-02 | Cup | 5 | 10
3 | 1 | 2020-01-03 | Ball | 2 | 7
4 | 1 | 2020-02-04 | Ball | 4 | 9
5 | 1 | 2020-02-05 | Ball | 1 | 6
6 | 1 | 2020-02-06 | Cup | 10 | 11
7 | 1 | 2020-02-07 | Phone | 5 | 16
8 | 1 | 2020-02-08 | Cup | 5 | 11
9 | 1 | 2020-02-09 | Ball | 5 | 15
For future reference, here is an adaptation of lptr's excellent JSON approach.
It works for more than 3 categories without changing anything.
with RCTE as
(
    select *, cast(concat('{"', category, '":', amount, '}') as varchar(max)) as j
    from #PriceHistory t
    where txn_id=1

    union all

    select t.*, cast(json_modify(cte.j, concat('$.', t.category), t.amount) as varchar(max))
    from RCTE cte
    join #PriceHistory t on t.txn_id = cte.txn_id+1
)
select txn_id, cust_id, trans_date, category, amount
, (select sum(cast(value as int)) from openjson(j)) as SumAmount
, j
from RCTE
order by txn_id
txn_id | cust_id | trans_date | category | amount | SumAmount | j
-----: | ------: | :--------- | :------- | -----: | --------: | :----------------------------
1 | 1 | 2020-01-01 | Ball | 5 | 5 | {"Ball":5}
2 | 1 | 2020-01-02 | Cup | 5 | 10 | {"Ball":5,"Cup":5}
3 | 1 | 2020-01-03 | Ball | 2 | 7 | {"Ball":2,"Cup":5}
4 | 1 | 2020-02-04 | Ball | 4 | 9 | {"Ball":4,"Cup":5}
5 | 1 | 2020-02-05 | Ball | 1 | 6 | {"Ball":1,"Cup":5}
6 | 1 | 2020-02-06 | Cup | 10 | 11 | {"Ball":1,"Cup":10}
7 | 1 | 2020-02-07 | Phone | 5 | 16 | {"Ball":1,"Cup":10,"Phone":5}
8 | 1 | 2020-02-08 | Cup | 5 | 11 | {"Ball":1,"Cup":5,"Phone":5}
9 | 1 | 2020-02-09 | Ball | 5 | 15 | {"Ball":5,"Cup":5,"Phone":5}
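For comparison, the same running total can also be sketched without recursion by using window functions. This is a different technique from the recursive CTEs above, not a restatement of them. The idea is that each row changes the total by its amount minus the previous amount of the same category, so a running sum of those differences gives SumAmount. The sketch assumes the #PriceHistory columns shown in the sample data, assumes txn_id reflects transaction order, and needs SQL Server 2012 or later for LAG. The CTE names deltas and running are made up for illustration.

```sql
WITH deltas AS
(
    -- Change contributed by each row: new amount minus the previous
    -- amount of the same category (0 when the category is first seen).
    SELECT txn_id, cust_id, trans_date, category, amount
         , amount - LAG(amount, 1, 0) OVER (PARTITION BY cust_id, category
                                            ORDER BY txn_id) AS delta
    FROM #PriceHistory
),
running AS
(
    -- Running sum of the changes = sum of the latest amount per category.
    SELECT txn_id, cust_id, trans_date, category, amount
         , SUM(delta) OVER (PARTITION BY cust_id
                            ORDER BY txn_id
                            ROWS UNBOUNDED PRECEDING) AS SumAmount
    FROM deltas
)
-- The value the question is ultimately after: the highest SumAmount.
SELECT cust_id, MAX(SumAmount) AS MaxSumAmount
FROM running
GROUP BY cust_id;
```

On the sample data this returns 16 for cust_id 1. Selecting from running instead of aggregating gives the per-row SumAmount column. Unlike the recursive versions, this form does not depend on txn_id being gap-free and is not subject to SQL Server's default limit of 100 recursion levels.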