How to group each section of payments by the minimum date of that section using window functions
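I have a table that stores payment change records.
Each time the payment method changes, the payment method used and the date are stored. The data comes in as a batch, but I only capture the first date on which a new payment method was used.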
CREATE TABLE #payments
(
pay_ID uniqueidentifier,
pay_type int,
pay_account varchar(max),
pay_routing varchar(max),
pay_date datetime
);
DECLARE @payID uniqueidentifier = newid();
--Actual payments made
INSERT INTO #payments (pay_ID, pay_type, pay_account, pay_routing, pay_date) VALUES
(@payID, 1, 'e121', '0101', '09/18/2020'),
(@payID, 1, 'e121', '0101', '09/19/2020'),
(@payID, 1, 'e121', '0101', '09/20/2020'),
(@payID, 2, 'e122', '0102', '09/21/2020'),
(@payID, 2, 'e122', '0102', '09/22/2020'),
(@payID, 1, 'e121', '0101', '09/23/2020'),
(@payID, 1, 'e121', '0101', '09/24/2020'),
(@payID, 1, 'e121', '0101', '09/25/2020'),
(@payID, 2, 'e122', '0102', '09/26/2020'),
(@payID, 2, 'e122', '0102', '09/27/2020'),
(@payID, 3, 'e123', '0103', '09/28/2020'),
(@payID, 1, 'e121', '0101', '09/29/2020'),
(@payID, 1, 'e121', '0101', '09/30/2020'),
(@payID, 1, 'e121', '0101', '10/01/2020'),
(@payID, 1, 'e121', '0101', '10/02/2020')
SELECT *
FROM #payments
ORDER BY pay_ID ASC, pay_date ASC;
This code works to create a group for each change in payment, but I am not sure how to use it to get the start and end dates.
SELECT
p.*
FROM
(SELECT
p.*,
LAG(pay_date) OVER (PARTITION BY pay_id ORDER BY pay_date) AS prev_pd,
LAG(pay_date) OVER (PARTITION BY pay_id, pay_account, pay_type, pay_routing ORDER BY pay_date) AS prev_pd_grp
FROM
#payments p) p
WHERE
prev_pd_grp IS NULL OR prev_pd_grp <> prev_pd
The desired result is that the first payment in each section of a payment change is stamped with a start date and an end date.
ID PayType account routing CreatedDate start end
FB4FE2A7-3609-4E35-AFB9-908B2D3072E9 1 e121 0101 2020-09-18 00:00:00.000 2020-09-18 00:00:00.000 2020-09-20 00:00:00.000
FB4FE2A7-3609-4E35-AFB9-908B2D3072E9 1 e121 0101 2020-09-19 00:00:00.000 NULL NULL
FB4FE2A7-3609-4E35-AFB9-908B2D3072E9 1 e121 0101 2020-09-20 00:00:00.000 NULL NULL
FB4FE2A7-3609-4E35-AFB9-908B2D3072E9 2 e122 0102 2020-09-21 00:00:00.000 2020-09-21 00:00:00.000 2020-09-22 00:00:00.000
FB4FE2A7-3609-4E35-AFB9-908B2D3072E9 2 e122 0102 2020-09-22 00:00:00.000 NULL NULL
FB4FE2A7-3609-4E35-AFB9-908B2D3072E9 1 e121 0101 2020-09-23 00:00:00.000 2020-09-23 00:00:00.000 2020-09-25 00:00:00.000
FB4FE2A7-3609-4E35-AFB9-908B2D3072E9 1 e121 0101 2020-09-24 00:00:00.000 NULL NULL
FB4FE2A7-3609-4E35-AFB9-908B2D3072E9 1 e121 0101 2020-09-25 00:00:00.000 NULL NULL
FB4FE2A7-3609-4E35-AFB9-908B2D3072E9 2 e122 0102 2020-09-26 00:00:00.000 2020-09-26 00:00:00.000 2020-09-27 00:00:00.000
FB4FE2A7-3609-4E35-AFB9-908B2D3072E9 2 e122 0102 2020-09-27 00:00:00.000 NULL NULL
FB4FE2A7-3609-4E35-AFB9-908B2D3072E9 3 e123 0103 2020-09-28 00:00:00.000 2020-09-28 00:00:00.000 2020-09-28 00:00:00.000
FB4FE2A7-3609-4E35-AFB9-908B2D3072E9 1 e121 0101 2020-09-29 00:00:00.000 2020-09-29 00:00:00.000 2020-10-02 00:00:00.000
FB4FE2A7-3609-4E35-AFB9-908B2D3072E9 1 e121 0101 2020-09-30 00:00:00.000 NULL NULL
FB4FE2A7-3609-4E35-AFB9-908B2D3072E9 1 e121 0101 2020-10-01 00:00:00.000 NULL NULL
FB4FE2A7-3609-4E35-AFB9-908B2D3072E9 1 e121 0101 2020-10-02 00:00:00.000 NULL NULL
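This is a gaps-and-islands problem. Here is an approach that uses the difference between row numbers to identify the groups. You can then use row_number() again in the outer query to identify the first record of each group, and the window min() and max() to display the corresponding date range: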
select pay_id, pay_type, pay_account, pay_routing, pay_date,
case when row_number() over(partition by pay_id, pay_type, rn1 - rn2 order by pay_date) = 1
then min(pay_date) over(partition by pay_id, pay_type, rn1 - rn2)
end as pay_date_start,
case when row_number() over(partition by pay_id, pay_type, rn1 - rn2 order by pay_date) = 1
then max(pay_date) over(partition by pay_id, pay_type, rn1 - rn2)
end as pay_date_end
from (
select p.*,
row_number() over(partition by pay_id order by pay_date) rn1,
row_number() over(partition by pay_id, pay_type order by pay_date) rn2
from #payments p
) p
order by pay_id, pay_date
pay_id | pay_type | pay_account | pay_routing | pay_date | pay_date_start | pay_date_end
:----------------------------------- | -------: | :---------- | :---------- | :---------------------- | :---------------------- | :----------------------
2c1a463f-198b-41bd-a1a4-30aafda21d4f | 1 | e121 | 0101 | 2020-09-18 00:00:00.000 | 2020-09-18 00:00:00.000 | 2020-09-20 00:00:00.000
2c1a463f-198b-41bd-a1a4-30aafda21d4f | 1 | e121 | 0101 | 2020-09-19 00:00:00.000 | null | null
2c1a463f-198b-41bd-a1a4-30aafda21d4f | 1 | e121 | 0101 | 2020-09-20 00:00:00.000 | null | null
2c1a463f-198b-41bd-a1a4-30aafda21d4f | 2 | e122 | 0102 | 2020-09-21 00:00:00.000 | 2020-09-21 00:00:00.000 | 2020-09-22 00:00:00.000
2c1a463f-198b-41bd-a1a4-30aafda21d4f | 2 | e122 | 0102 | 2020-09-22 00:00:00.000 | null | null
2c1a463f-198b-41bd-a1a4-30aafda21d4f | 1 | e121 | 0101 | 2020-09-23 00:00:00.000 | 2020-09-23 00:00:00.000 | 2020-09-25 00:00:00.000
2c1a463f-198b-41bd-a1a4-30aafda21d4f | 1 | e121 | 0101 | 2020-09-24 00:00:00.000 | null | null
2c1a463f-198b-41bd-a1a4-30aafda21d4f | 1 | e121 | 0101 | 2020-09-25 00:00:00.000 | null | null
2c1a463f-198b-41bd-a1a4-30aafda21d4f | 2 | e122 | 0102 | 2020-09-26 00:00:00.000 | 2020-09-26 00:00:00.000 | 2020-09-27 00:00:00.000
2c1a463f-198b-41bd-a1a4-30aafda21d4f | 2 | e122 | 0102 | 2020-09-27 00:00:00.000 | null | null
2c1a463f-198b-41bd-a1a4-30aafda21d4f | 3 | e123 | 0103 | 2020-09-28 00:00:00.000 | 2020-09-28 00:00:00.000 | 2020-09-28 00:00:00.000
2c1a463f-198b-41bd-a1a4-30aafda21d4f | 1 | e121 | 0101 | 2020-09-29 00:00:00.000 | 2020-09-29 00:00:00.000 | 2020-10-02 00:00:00.000
2c1a463f-198b-41bd-a1a4-30aafda21d4f | 1 | e121 | 0101 | 2020-09-30 00:00:00.000 | null | null
2c1a463f-198b-41bd-a1a4-30aafda21d4f | 1 | e121 | 0101 | 2020-10-01 00:00:00.000 | null | null
2c1a463f-198b-41bd-a1a4-30aafda21d4f | 1 | e121 | 0101 | 2020-10-02 00:00:00.000 | null | null
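If it is not obvious why rn1 - rn2 identifies a group, it may help to look at the islands on their own. The sketch below is only an illustration, not part of the original query: it assumes SQLite (through Python's sqlite3 module, SQLite 3.25 or later for window functions) instead of SQL Server so that it can run standalone, uses a hypothetical table named payments with a placeholder pay_id of 'A', and keeps only the pay_type and pay_date columns of the sample data. It collapses each island to one row with GROUP BY rather than stamping the first row.

```python
import sqlite3

# Reduced copy of the sample data: (pay_type, pay_date) for a single pay_id.
# ISO date strings are used so that text ordering matches date ordering.
rows = [
    (1, "2020-09-18"), (1, "2020-09-19"), (1, "2020-09-20"),
    (2, "2020-09-21"), (2, "2020-09-22"),
    (1, "2020-09-23"), (1, "2020-09-24"), (1, "2020-09-25"),
    (2, "2020-09-26"), (2, "2020-09-27"),
    (3, "2020-09-28"),
    (1, "2020-09-29"), (1, "2020-09-30"), (1, "2020-10-01"), (1, "2020-10-02"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (pay_id TEXT, pay_type INTEGER, pay_date TEXT)")
# 'A' is a placeholder pay_id (assumption; the original uses a uniqueidentifier).
conn.executemany("INSERT INTO payments VALUES ('A', ?, ?)", rows)

# rn1 counts every row of the pay_id, rn2 counts only rows of the same pay_type.
# While the pay_type stays the same, both counters advance together, so their
# difference is constant; when another pay_type interrupts, only rn1 advances
# and the difference jumps to a new value.
islands = conn.execute("""
    SELECT pay_type, rn1 - rn2 AS grp, MIN(pay_date), MAX(pay_date), COUNT(*)
    FROM (
        SELECT p.*,
               ROW_NUMBER() OVER (PARTITION BY pay_id ORDER BY pay_date) AS rn1,
               ROW_NUMBER() OVER (PARTITION BY pay_id, pay_type ORDER BY pay_date) AS rn2
        FROM payments p
    ) AS t
    GROUP BY pay_id, pay_type, rn1 - rn2
    ORDER BY MIN(pay_date)
""").fetchall()

for pay_type, grp, start, end, n in islands:
    print(pay_type, grp, start, end, n)
```

On the sample data this yields six islands, matching the six rows that carry a start and end date in the result above. The difference on its own is not guaranteed to be unique across payment types, which is why pay_type stays in the partition (and in the GROUP BY here) alongside rn1 - rn2.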