Collapse multiple rows into a single row based upon a break condition
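I have what sounds like a simple requirement that has had me stumped for a day or so, so it's time to ask the experts for help.
My requirement is to roll up multiple rows into a single row based on a break condition: whenever any of Employee ID, Allowance Plan, Allowance Amount or To Date changes, the row is to be kept, if that makes sense.
The sample source data set looks like this:
After collapsing the rows, the target data should look like this:
As you can see, I don't need any kind of running-total calculation; I just need to collapse the rows into one record per From Date/To Date combination.
So far I have tried the following SQL, using GROUP BY and the MIN function: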
select [Employee ID], [Allowance Plan],
min([From Date]), max([To Date]), [Allowance Amount]
from [dbo].[#AllowInfo]
group by [Employee ID], [Allowance Plan], [Allowance Amount]
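But this only gives me a single row and does not take the break condition into account.
What do I need to do to roll up the records correctly (correct me if that is not the right term) while taking the break condition into account?
Any help is appreciated.
Thanks.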
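Note that your test data doesn't exercise the algorithm very well; for example, you have only one employee and one plan. Also, as you describe it, you would end up with 4 rows, because the dates change between rows 7->8, 8->9, 9->10 and 10->11.
But I can see what you are trying to do, so this should at least put you on the right track, and it returns the expected 3 rows. I have taken the end of a group to be where the employee/plan/amount changes, or where ToDate is not null (or where we reach the end of the data).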
CREATE TABLE #data
(
    RowID INT,
    EmployeeID INT,
    AllowancePlan VARCHAR(30),
    FromDate DATE,
    ToDate DATE,
    AllowanceAmount DECIMAL(12,2)
);
-- the date literals below are day/month/year, so make that explicit
SET DATEFORMAT dmy;
INSERT INTO #data(RowID, EmployeeID, AllowancePlan, FromDate, ToDate, AllowanceAmount)
VALUES
    (1,200690,'CarAllowance','30/03/2017', NULL, 1000.0),
    (2,200690,'CarAllowance','01/08/2017', NULL, 1000.0),
    (6,200690,'CarAllowance','23/04/2018', NULL, 1000.0),
    (7,200690,'CarAllowance','30/03/2018', NULL, 1000.0),
    (8,200690,'CarAllowance','21/06/2018', '01/04/2019', 1000.0),
    (9,200690,'CarAllowance','04/11/2021', NULL, 1000.0),
    (10,200690,'CarAllowance','30/03/2017', '13/05/2022', 1000.0),
    (11,200690,'CarAllowance','14/05/2022', NULL, 850.0);
-- find where the break points are
WITH chg AS
(
    SELECT *,
           CASE WHEN LAG(EmployeeID, 1, -1) OVER(ORDER BY RowID) != EmployeeID
                  OR LAG(AllowancePlan, 1, 'X') OVER(ORDER BY RowID) != AllowancePlan
                  OR LAG(AllowanceAmount, 1, -1) OVER(ORDER BY RowID) != AllowanceAmount
                  OR LAG(ToDate, 1) OVER(ORDER BY RowID) IS NOT NULL
                THEN 1 ELSE 0 END AS NewGroup
    FROM #data
),
-- count the number of break points as we go to group the related rows
grp AS
(
    SELECT chg.*,
           ISNULL(
               SUM(NewGroup)
                   OVER (ORDER BY RowID
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW),
               0) AS grpNum
    FROM chg
)
-- collapse each group into a single row
SELECT MIN(grp.RowID) AS RowID,
       MAX(grp.EmployeeID) AS EmployeeID,
       MAX(grp.AllowancePlan) AS AllowancePlan,
       MIN(grp.FromDate) AS FromDate,
       MAX(grp.ToDate) AS ToDate,
       MAX(grp.AllowanceAmount) AS AllowanceAmount
FROM grp
GROUP BY grpNum
-- without an ORDER BY the row order of the result is not guaranteed
ORDER BY MIN(grp.RowID);
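The query above orders the whole table by RowID, which is fine for the single employee and plan in the sample. With several employees or plans it may be easier to partition the windows instead of comparing EmployeeID and AllowancePlan by hand. The following is only a sketch of that variant, not a tested drop-in replacement: it assumes the #data table defined above, that RowID is never NULL, and that RowID reflects the intended order of rows within each employee/plan.

```sql
-- Sketch only. Assumptions: the #data table from the answer above,
-- RowID is never NULL, and RowID gives the intended row order
-- within each employee/plan.
WITH chg AS
(
    SELECT d.*,
           -- a new group starts on the first row of a partition, when the
           -- amount changes, or when the previous row had a ToDate
           CASE WHEN LAG(d.RowID) OVER (PARTITION BY d.EmployeeID, d.AllowancePlan
                                        ORDER BY d.RowID) IS NULL
                  OR LAG(d.AllowanceAmount) OVER (PARTITION BY d.EmployeeID, d.AllowancePlan
                                                  ORDER BY d.RowID) <> d.AllowanceAmount
                  OR LAG(d.ToDate) OVER (PARTITION BY d.EmployeeID, d.AllowancePlan
                                         ORDER BY d.RowID) IS NOT NULL
                THEN 1 ELSE 0 END AS NewGroup
    FROM #data AS d
),
grp AS
(
    -- running count of break points, restarted for each employee/plan
    SELECT chg.*,
           SUM(chg.NewGroup) OVER (PARTITION BY chg.EmployeeID, chg.AllowancePlan
                                   ORDER BY chg.RowID
                                   ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS grpNum
    FROM chg
)
SELECT MIN(grp.RowID) AS RowID,
       grp.EmployeeID,
       grp.AllowancePlan,
       MIN(grp.FromDate) AS FromDate,
       MAX(grp.ToDate) AS ToDate,
       MAX(grp.AllowanceAmount) AS AllowanceAmount
FROM grp
GROUP BY grp.EmployeeID, grp.AllowancePlan, grp.grpNum
ORDER BY grp.EmployeeID, grp.AllowancePlan, MIN(grp.RowID);
```

Because the partition already separates employees and plans, the sentinel defaults (-1, 'X') passed to LAG in the original are no longer needed; the first row of each partition is detected by LAG(RowID) being NULL. On the sample rows this should form the same three groups as the query above.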
One approach is to get the last date for every row and then group by it:
select min(t.RowID) as RowID,
       t.EmployeeID,
       min(t.AllowancePlan) as AllowancePlan,
       min(t.FromDate) as FromDate,
       max(t.ToDate) as ToDate,
       min(t.AllowanceAmount) as AllowanceAmount
from ( select t.RowID,
              t.EmployeeID,
              t.FromDate,
              t.AllowancePlan,
              t.AllowanceAmount,
              case when t.ToDate is null then ( select top 1 t2.ToDate
                                                from test t2
                                                where t2.EmployeeID = t.EmployeeID
                                                  and t2.ToDate is not null
                                                  and t2.FromDate > t.FromDate -- t2.RowID > t.RowID
                                                order by t2.RowID, t2.FromDate
                                              )
                   else t.ToDate
              end as todate
       from test t
     ) t
group by t.EmployeeID, t.ToDate
order by t.EmployeeID, min(t.RowID)
See and test it yourself in this DBFiddle.
The result is:
RowID | EmployeeID | AllowancePlan | FromDate | ToDate | AllowanceAmount |
---|---|---|---|---|---|
1 | 200690 | CarAllowance | 2017-03-30 | 2019-04-01 | 1000 |
9 | 200690 | CarAllowance | 2021-11-04 | 2022-05-13 | 1000 |
11 | 200690 | CarAllowance | 2022-05-14 | (null) | 850 |