根据优先级填补空白
Fill gaps with set based on priority
目标
对于每个 Foo
,我们应该尝试用 FooBar
中的记录尽可能多地填充时间范围。当 FooBar
中不存在记录时,可以有空的时间范围。
FooBar
中的记录表现为一个集合。
这意味着如果 FooId
和时间范围(完全)相同,则 BarId 在此时间范围内均有效。
应根据集合优先级填补空白。
Table结构
CREATE TABLE Foo(
FooId INT NOT NULL,
ValidFrom DATETIME NOT NULL,
ValidUntil DATETIME NOT NULL,
)
CREATE TABLE FooBar (
FooId INT NOT NULL,
BarId INT NOT NULL,
ValidFrom DATETIME NOT NULL,
ValidUntil DATETIME NOT NULL,
Priority TINYINT NOT NULL,
)
示例数据
INSERT INTO Foo(FooId, ValidFrom, ValidUntil)
VALUES
(1, '2020-01-01', '2021-12-31')
, (2, '2020-01-01', '2021-06-30')
INSERT INTO FooBar(FooId, BarId, ValidFrom, ValidUntil, Priority)
VALUES
-- First set for FooId = 1 with prio 1
(1, 1, '2021-01-01', '2021-03-01', 1)
, (1, 2, '2021-01-01', '2021-03-01', 1)
, (1, 3, '2021-01-01', '2021-03-01', 1)
-- Second set for FooId = 1 with prio 2
, (1, 1, '2021-02-01', '2021-06-01', 2)
, (1, 2, '2021-02-01', '2021-06-01', 2)
-- Third set for FooId = 1 with prio 3
, (1, 1, '2021-01-01', '2021-12-31', 3)
, (1, 2, '2021-01-01', '2021-12-31', 3)
, (1, 3, '2021-01-01', '2021-12-31', 3)
-- Fourth set for FooId = 1 with Prio 1
, (1, 4, '2021-04-01', '2021-04-02', 1)
, (1, 5, '2021-04-01', '2021-04-02', 1)
-- First set for FooId = 2 with prio 3
, (2, 6, '2021-01-01', '2021-04-02', 3)
预期结果
要澄清的原始列,不应该是生成的结果集的一部分
FooId
BarId
ValidFrom
ValidUntil
Origin
1
1
2021-01-01
2021-03-01
First set
1
2
2021-01-01
2021-03-01
First set
1
3
2021-01-01
2021-03-01
First set
1
1
2021-03-02
2021-03-31
Second set
1
2
2021-03-02
2021-03-31
Second set
1
4
2021-04-01
2021-04-02
Fourth set
1
5
2021-04-01
2021-04-02
Fourth set
1
1
2021-04-03
2021-06-01
Second set
1
2
2021-04-03
2021-06-01
Second set
1
1
2021-06-02
2021-12-31
Third set
1
2
2021-06-02
2021-12-31
Third set
1
3
2021-06-02
2021-12-31
Third set
2
6
2021-01-01
2021-12-31
First set (FooId = 2)
我知道这可以通过游标或 while 循环实现,但我正在寻找更 performant/elegant 的解决方案。
兼容级别为:130
It's ok to have empty time frames when no records exists in FooBar.
这是否意味着没有空帧的解决方案也可以接受table?
如果是这样,那么 FooId = 1
的第三组也定义了一个 BarId = 3
用于周期 2021-03-02 -> 2021-03-31
例如。
示例数据
稍微调整了数据模型,结果中没有这些时间戳 (00:00:00.000
)。
还添加了一个集合标识符 (FooBar.SetId
) 以便于溯源。
CREATE TABLE Foo(
FooId INT NOT NULL,
ValidFrom DATE/*TIME*/ NOT NULL,
ValidUntil DATE/*TIME*/ NOT NULL
)
CREATE TABLE FooBar (
FooId INT NOT NULL,
BarId INT NOT NULL,
ValidFrom DATE/*TIME*/ NOT NULL,
ValidUntil DATE/*TIME*/ NOT NULL,
Priority TINYINT NOT NULL,
SetId nvarchar(5)
)
INSERT INTO Foo(FooId, ValidFrom, ValidUntil)
VALUES
(1, '2020-01-01', '2021-12-31')
, (2, '2020-01-01', '2021-06-30')
INSERT INTO FooBar(FooId, BarId, ValidFrom, ValidUntil, Priority, SetId)
VALUES
-- First set for FooId = 1 with prio 1
(1, 1, '2021-01-01', '2021-03-01', 1, 'Set 1')
, (1, 2, '2021-01-01', '2021-03-01', 1, 'Set 1')
, (1, 3, '2021-01-01', '2021-03-01', 1, 'Set 1')
-- Second set for FooId = 1 with prio 2
, (1, 1, '2021-02-01', '2021-06-01', 2, 'Set 2')
, (1, 2, '2021-02-01', '2021-06-01', 2, 'Set 2')
-- Third set for FooId = 1 with prio 3
, (1, 1, '2021-01-01', '2021-12-31', 3, 'Set 3')
, (1, 2, '2021-01-01', '2021-12-31', 3, 'Set 3')
, (1, 3, '2021-01-01', '2021-12-31', 3, 'Set 3')
-- Fourth set for FooId = 1 with Prio 1
, (1, 4, '2021-04-01', '2021-04-02', 1, 'Set 4')
, (1, 5, '2021-04-01', '2021-04-02', 1, 'Set 4')
-- First set for FooId = 2 with prio 3
, (2, 6, '2021-01-01', '2021-04-02', 3, 'Set 1')
解决方案
- 常见的 table 表达式 (CTE)
ValidFrom
和 ValidPeriod
将 Foo
和 FooBar
中的所有周期信息剪切到最小的单个周期中。
- 上一步还为使用
exists
子句删除的每个 FooId
生成了一个额外的尾随不完整句点。
- 然后为每个单独的周期获取具有第一个优先级值的
FooBar
记录(也就是说不允许具有更小优先级的类似记录:not exists ... fb2.Priority < fb.Priority
)。
这给出:
with ValidFrom as
(
select f.FooId,
f.ValidFrom
from Foo f
union
select f.FooId,
dateadd(day, 1, f.ValidUntil)
from Foo f
union
select fb.FooId,
fb.ValidFrom
from FooBar fb
union
select fb.FooId,
dateadd(day, 1, fb.ValidUntil)
from Foobar fb
),
ValidPeriod as
(
select vf.FooId,
vf.ValidFrom,
dateadd(day, -1, lead(vf.ValidFrom) over(partition by vf.FooId order by vf.ValidFrom)) as ValidUntil
from ValidFrom vf
)
select vp.FooId,
fb.BarId,
vp.ValidFrom,
vp.ValidUntil,
--fb.ValidFrom,
--fb.ValidUntil,
--fb.Priority,
fb.SetId
from ValidPeriod vp
left join FooBar fb
on fb.FooId = vp.FooId
and fb.ValidFrom <= vp.ValidUntil
and fb.ValidUntil >= vp.ValidFrom
and not exists ( select 'x'
from FooBar fb2
where fb2.FooId = fb.FooId
and fb2.BarId = fb.BarId
and fb2.ValidFrom <= vp.ValidUntil
and fb2.ValidUntil >= vp.ValidFrom
and fb2.Priority < fb.Priority )
where exists ( select 'x'
from ValidPeriod vp2
where vp2.FooId = vp.FooId
and vp2.ValidFrom > vp.ValidFrom )
order by vp.FooId,
vp.ValidFrom,
fb.BarId;
结果
此结果包含的经期信息多于您在预期结果中请求的信息。从第一个 CTE 中删除带有 Foo
的 union
将删除 null
值并将周期限制为仅在 FooBar
中可用的周期信息(实际上这会消除Foo
完全来自解决方案)。
以 vp.ValidFrom
和 vp.ValidUntil
作为结果周期:
FooId BarId ValidFrom ValidUntil SetId
----- ----- ---------- ---------- -----
1 null 2020-01-01 2020-12-31 null -- extra row
1 1 2021-01-01 2021-01-31 Set 1
1 2 2021-01-01 2021-01-31 Set 1
1 3 2021-01-01 2021-01-31 Set 1
1 1 2021-02-01 2021-03-01 Set 1 -- extra row
1 2 2021-02-01 2021-03-01 Set 1 -- extra row
1 3 2021-02-01 2021-03-01 Set 1 -- extra row
1 1 2021-03-02 2021-03-31 Set 2
1 2 2021-03-02 2021-03-31 Set 2
1 3 2021-03-02 2021-03-31 Set 3 -- extra row
1 1 2021-04-01 2021-04-02 Set 2 -- extra row
1 2 2021-04-01 2021-04-02 Set 2 -- extra row
1 3 2021-04-01 2021-04-02 Set 3 -- extra row
1 4 2021-04-01 2021-04-02 Set 4
1 5 2021-04-01 2021-04-02 Set 4
1 1 2021-04-03 2021-06-01 Set 2
1 2 2021-04-03 2021-06-01 Set 2
1 3 2021-04-03 2021-06-01 Set 3 -- extra row
1 1 2021-06-02 2021-12-31 Set 3
1 2 2021-06-02 2021-12-31 Set 3
1 3 2021-06-02 2021-12-31 Set 3
2 null 2020-01-01 2020-12-31 null -- extra row
2 6 2021-01-01 2021-04-02 Set 1
2 null 2021-04-03 2021-06-30 null -- extra row
以 fb.ValidFrom
和 fb.ValidUntil
作为结果周期:
FooId BarId ValidFrom ValidUntil SetId
----- ----- ---------- ---------- -----
1 null null null null -- extra row
1 1 2021-01-01 2021-03-01 Set 1
1 2 2021-01-01 2021-03-01 Set 1
1 3 2021-01-01 2021-03-01 Set 1
1 1 2021-01-01 2021-03-01 Set 1 -- extra row
1 2 2021-01-01 2021-03-01 Set 1 -- extra row
1 3 2021-01-01 2021-03-01 Set 1 -- extra row
1 1 2021-02-01 2021-06-01 Set 2
1 2 2021-02-01 2021-06-01 Set 2
1 3 2021-01-01 2021-12-31 Set 3 -- extra row
1 1 2021-02-01 2021-06-01 Set 2 -- extra row
1 2 2021-02-01 2021-06-01 Set 2 -- extra row
1 3 2021-01-01 2021-12-31 Set 3 -- extra row
1 4 2021-04-01 2021-04-02 Set 4
1 5 2021-04-01 2021-04-02 Set 4
1 1 2021-02-01 2021-06-01 Set 2
1 2 2021-02-01 2021-06-01 Set 2
1 3 2021-01-01 2021-12-31 Set 3 -- extra row
1 1 2021-01-01 2021-12-31 Set 3
1 2 2021-01-01 2021-12-31 Set 3
1 3 2021-01-01 2021-12-31 Set 3
2 null null null null -- extra row
2 6 2021-01-01 2021-04-02 Set 1
2 null null null null -- extra row
Fiddle 查看实际情况。
目标
对于每个 Foo
,我们应该尝试用 FooBar
中的记录尽可能多地填充时间范围。当 FooBar
中不存在记录时,可以有空的时间范围。
FooBar
中的记录表现为一个集合。
这意味着如果 FooId
和时间范围(完全)相同,则 BarId 在此时间范围内均有效。
应根据集合优先级填补空白。
Table结构
CREATE TABLE Foo(
FooId INT NOT NULL,
ValidFrom DATETIME NOT NULL,
ValidUntil DATETIME NOT NULL,
)
CREATE TABLE FooBar (
FooId INT NOT NULL,
BarId INT NOT NULL,
ValidFrom DATETIME NOT NULL,
ValidUntil DATETIME NOT NULL,
Priority TINYINT NOT NULL,
)
示例数据
INSERT INTO Foo(FooId, ValidFrom, ValidUntil)
VALUES
(1, '2020-01-01', '2021-12-31')
, (2, '2020-01-01', '2021-06-30')
INSERT INTO FooBar(FooId, BarId, ValidFrom, ValidUntil, Priority)
VALUES
-- First set for FooId = 1 with prio 1
(1, 1, '2021-01-01', '2021-03-01', 1)
, (1, 2, '2021-01-01', '2021-03-01', 1)
, (1, 3, '2021-01-01', '2021-03-01', 1)
-- Second set for FooId = 1 with prio 2
, (1, 1, '2021-02-01', '2021-06-01', 2)
, (1, 2, '2021-02-01', '2021-06-01', 2)
-- Third set for FooId = 1 with prio 3
, (1, 1, '2021-01-01', '2021-12-31', 3)
, (1, 2, '2021-01-01', '2021-12-31', 3)
, (1, 3, '2021-01-01', '2021-12-31', 3)
-- Fourth set for FooId = 1 with Prio 1
, (1, 4, '2021-04-01', '2021-04-02', 1)
, (1, 5, '2021-04-01', '2021-04-02', 1)
-- First set for FooId = 2 with prio 3
, (2, 6, '2021-01-01', '2021-04-02', 3)
预期结果 要澄清的原始列,不应该是生成的结果集的一部分
FooId | BarId | ValidFrom | ValidUntil | Origin |
---|---|---|---|---|
1 | 1 | 2021-01-01 | 2021-03-01 | First set |
1 | 2 | 2021-01-01 | 2021-03-01 | First set |
1 | 3 | 2021-01-01 | 2021-03-01 | First set |
1 | 1 | 2021-03-02 | 2021-03-31 | Second set |
1 | 2 | 2021-03-02 | 2021-03-31 | Second set |
1 | 4 | 2021-04-01 | 2021-04-02 | Fourth set |
1 | 5 | 2021-04-01 | 2021-04-02 | Fourth set |
1 | 1 | 2021-04-03 | 2021-06-01 | Second set |
1 | 2 | 2021-04-03 | 2021-06-01 | Second set |
1 | 1 | 2021-06-02 | 2021-12-31 | Third set |
1 | 2 | 2021-06-02 | 2021-12-31 | Third set |
1 | 3 | 2021-06-02 | 2021-12-31 | Third set |
2 | 6 | 2021-01-01 | 2021-12-31 | First set (FooId = 2) |
我知道这可以通过游标或 while 循环实现,但我正在寻找更 performant/elegant 的解决方案。
兼容级别为:130
It's ok to have empty time frames when no records exists in FooBar.
这是否意味着没有空帧的解决方案也可以接受table?
如果是这样,那么 FooId = 1
的第三组也定义了一个 BarId = 3
用于周期 2021-03-02 -> 2021-03-31
例如。
示例数据
稍微调整了数据模型,结果中没有这些时间戳 (00:00:00.000
)。
还添加了一个集合标识符 (FooBar.SetId
) 以便于溯源。
CREATE TABLE Foo(
FooId INT NOT NULL,
ValidFrom DATE/*TIME*/ NOT NULL,
ValidUntil DATE/*TIME*/ NOT NULL
)
CREATE TABLE FooBar (
FooId INT NOT NULL,
BarId INT NOT NULL,
ValidFrom DATE/*TIME*/ NOT NULL,
ValidUntil DATE/*TIME*/ NOT NULL,
Priority TINYINT NOT NULL,
SetId nvarchar(5)
)
INSERT INTO Foo(FooId, ValidFrom, ValidUntil)
VALUES
(1, '2020-01-01', '2021-12-31')
, (2, '2020-01-01', '2021-06-30')
INSERT INTO FooBar(FooId, BarId, ValidFrom, ValidUntil, Priority, SetId)
VALUES
-- First set for FooId = 1 with prio 1
(1, 1, '2021-01-01', '2021-03-01', 1, 'Set 1')
, (1, 2, '2021-01-01', '2021-03-01', 1, 'Set 1')
, (1, 3, '2021-01-01', '2021-03-01', 1, 'Set 1')
-- Second set for FooId = 1 with prio 2
, (1, 1, '2021-02-01', '2021-06-01', 2, 'Set 2')
, (1, 2, '2021-02-01', '2021-06-01', 2, 'Set 2')
-- Third set for FooId = 1 with prio 3
, (1, 1, '2021-01-01', '2021-12-31', 3, 'Set 3')
, (1, 2, '2021-01-01', '2021-12-31', 3, 'Set 3')
, (1, 3, '2021-01-01', '2021-12-31', 3, 'Set 3')
-- Fourth set for FooId = 1 with Prio 1
, (1, 4, '2021-04-01', '2021-04-02', 1, 'Set 4')
, (1, 5, '2021-04-01', '2021-04-02', 1, 'Set 4')
-- First set for FooId = 2 with prio 3
, (2, 6, '2021-01-01', '2021-04-02', 3, 'Set 1')
解决方案
- 常见的 table 表达式 (CTE)
ValidFrom
和ValidPeriod
将Foo
和FooBar
中的所有周期信息剪切到最小的单个周期中。 - 上一步还为使用
exists
子句删除的每个FooId
生成了一个额外的尾随不完整句点。 - 然后为每个单独的周期获取具有第一个优先级值的
FooBar
记录(也就是说不允许具有更小优先级的类似记录:not exists ... fb2.Priority < fb.Priority
)。
这给出:
with ValidFrom as
(
select f.FooId,
f.ValidFrom
from Foo f
union
select f.FooId,
dateadd(day, 1, f.ValidUntil)
from Foo f
union
select fb.FooId,
fb.ValidFrom
from FooBar fb
union
select fb.FooId,
dateadd(day, 1, fb.ValidUntil)
from Foobar fb
),
ValidPeriod as
(
select vf.FooId,
vf.ValidFrom,
dateadd(day, -1, lead(vf.ValidFrom) over(partition by vf.FooId order by vf.ValidFrom)) as ValidUntil
from ValidFrom vf
)
select vp.FooId,
fb.BarId,
vp.ValidFrom,
vp.ValidUntil,
--fb.ValidFrom,
--fb.ValidUntil,
--fb.Priority,
fb.SetId
from ValidPeriod vp
left join FooBar fb
on fb.FooId = vp.FooId
and fb.ValidFrom <= vp.ValidUntil
and fb.ValidUntil >= vp.ValidFrom
and not exists ( select 'x'
from FooBar fb2
where fb2.FooId = fb.FooId
and fb2.BarId = fb.BarId
and fb2.ValidFrom <= vp.ValidUntil
and fb2.ValidUntil >= vp.ValidFrom
and fb2.Priority < fb.Priority )
where exists ( select 'x'
from ValidPeriod vp2
where vp2.FooId = vp.FooId
and vp2.ValidFrom > vp.ValidFrom )
order by vp.FooId,
vp.ValidFrom,
fb.BarId;
结果
此结果包含的经期信息多于您在预期结果中请求的信息。从第一个 CTE 中删除带有 Foo
的 union
将删除 null
值并将周期限制为仅在 FooBar
中可用的周期信息(实际上这会消除Foo
完全来自解决方案)。
以 vp.ValidFrom
和 vp.ValidUntil
作为结果周期:
FooId BarId ValidFrom ValidUntil SetId
----- ----- ---------- ---------- -----
1 null 2020-01-01 2020-12-31 null -- extra row
1 1 2021-01-01 2021-01-31 Set 1
1 2 2021-01-01 2021-01-31 Set 1
1 3 2021-01-01 2021-01-31 Set 1
1 1 2021-02-01 2021-03-01 Set 1 -- extra row
1 2 2021-02-01 2021-03-01 Set 1 -- extra row
1 3 2021-02-01 2021-03-01 Set 1 -- extra row
1 1 2021-03-02 2021-03-31 Set 2
1 2 2021-03-02 2021-03-31 Set 2
1 3 2021-03-02 2021-03-31 Set 3 -- extra row
1 1 2021-04-01 2021-04-02 Set 2 -- extra row
1 2 2021-04-01 2021-04-02 Set 2 -- extra row
1 3 2021-04-01 2021-04-02 Set 3 -- extra row
1 4 2021-04-01 2021-04-02 Set 4
1 5 2021-04-01 2021-04-02 Set 4
1 1 2021-04-03 2021-06-01 Set 2
1 2 2021-04-03 2021-06-01 Set 2
1 3 2021-04-03 2021-06-01 Set 3 -- extra row
1 1 2021-06-02 2021-12-31 Set 3
1 2 2021-06-02 2021-12-31 Set 3
1 3 2021-06-02 2021-12-31 Set 3
2 null 2020-01-01 2020-12-31 null -- extra row
2 6 2021-01-01 2021-04-02 Set 1
2 null 2021-04-03 2021-06-30 null -- extra row
以 fb.ValidFrom
和 fb.ValidUntil
作为结果周期:
FooId BarId ValidFrom ValidUntil SetId
----- ----- ---------- ---------- -----
1 null null null null -- extra row
1 1 2021-01-01 2021-03-01 Set 1
1 2 2021-01-01 2021-03-01 Set 1
1 3 2021-01-01 2021-03-01 Set 1
1 1 2021-01-01 2021-03-01 Set 1 -- extra row
1 2 2021-01-01 2021-03-01 Set 1 -- extra row
1 3 2021-01-01 2021-03-01 Set 1 -- extra row
1 1 2021-02-01 2021-06-01 Set 2
1 2 2021-02-01 2021-06-01 Set 2
1 3 2021-01-01 2021-12-31 Set 3 -- extra row
1 1 2021-02-01 2021-06-01 Set 2 -- extra row
1 2 2021-02-01 2021-06-01 Set 2 -- extra row
1 3 2021-01-01 2021-12-31 Set 3 -- extra row
1 4 2021-04-01 2021-04-02 Set 4
1 5 2021-04-01 2021-04-02 Set 4
1 1 2021-02-01 2021-06-01 Set 2
1 2 2021-02-01 2021-06-01 Set 2
1 3 2021-01-01 2021-12-31 Set 3 -- extra row
1 1 2021-01-01 2021-12-31 Set 3
1 2 2021-01-01 2021-12-31 Set 3
1 3 2021-01-01 2021-12-31 Set 3
2 null null null null -- extra row
2 6 2021-01-01 2021-04-02 Set 1
2 null null null null -- extra row
Fiddle 查看实际情况。