Fill gaps with set based on priority

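Goal

For each Foo, we should try to fill the time range with as many records from FooBar as possible. It is OK to have empty time frames when no records exist in FooBar. The records in FooBar behave as a set: if the FooId and the time range are (exactly) the same, all of those BarIds are valid within that time range. Gaps should be filled based on the priority of the set.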

Table structure

CREATE TABLE Foo(
    FooId INT NOT NULL,
    ValidFrom DATETIME NOT NULL,
    ValidUntil DATETIME NOT NULL,
)

CREATE TABLE FooBar (
    FooId INT NOT NULL,
    BarId INT NOT NULL,
    ValidFrom DATETIME NOT NULL,
    ValidUntil DATETIME NOT NULL,
    Priority TINYINT NOT NULL,
)

Sample data

INSERT INTO Foo(FooId, ValidFrom, ValidUntil)
VALUES 
  (1, '2020-01-01', '2021-12-31')
, (2, '2020-01-01', '2021-06-30')

INSERT INTO FooBar(FooId, BarId, ValidFrom, ValidUntil, Priority)
VALUES 
-- First set for FooId = 1 with prio 1
  (1, 1, '2021-01-01', '2021-03-01', 1)
, (1, 2, '2021-01-01', '2021-03-01', 1)
, (1, 3, '2021-01-01', '2021-03-01', 1)
-- Second set for FooId = 1 with prio 2 
, (1, 1, '2021-02-01', '2021-06-01', 2)
, (1, 2, '2021-02-01', '2021-06-01', 2)
-- Third set for FooId = 1 with prio 3
, (1, 1, '2021-01-01', '2021-12-31', 3)
, (1, 2, '2021-01-01', '2021-12-31', 3)
, (1, 3, '2021-01-01', '2021-12-31', 3)
-- Fourth set for FooId = 1 with Prio 1
, (1, 4, '2021-04-01', '2021-04-02', 1)
, (1, 5, '2021-04-01', '2021-04-02', 1)
-- First set for FooId = 2 with prio 3
, (2, 6, '2021-01-01', '2021-04-02', 3)

Expected result

The Origin column is there for clarification only and should not be part of the generated result set.

FooId  BarId  ValidFrom   ValidUntil  Origin
-----  -----  ----------  ----------  ------
1      1      2021-01-01  2021-03-01  First set
1      2      2021-01-01  2021-03-01  First set
1      3      2021-01-01  2021-03-01  First set
1      1      2021-03-02  2021-03-31  Second set
1      2      2021-03-02  2021-03-31  Second set
1      4      2021-04-01  2021-04-02  Fourth set
1      5      2021-04-01  2021-04-02  Fourth set
1      1      2021-04-03  2021-06-01  Second set
1      2      2021-04-03  2021-06-01  Second set
1      1      2021-06-02  2021-12-31  Third set
1      2      2021-06-02  2021-12-31  Third set
1      3      2021-06-02  2021-12-31  Third set
2      6      2021-01-01  2021-12-31  First set (FooId = 2)

I know this can be achieved with a cursor or a while loop, but I'm looking for a more performant/elegant solution.

Compatibility level: 130

It's ok to have empty time frames when no records exists in FooBar.

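Does this mean that a solution without empty time frames is also acceptable?
If so, then the third set for FooId = 1 also defines a BarId = 3 for the period 2021-03-02 -> 2021-03-31, for example.

Sample data

I adjusted the data model slightly so that the results do not carry the time part (00:00:00.000). I also added a set identifier (FooBar.SetId) to make it easier to trace where each row comes from.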

CREATE TABLE Foo(
    FooId INT NOT NULL,
    ValidFrom DATE/*TIME*/ NOT NULL,
    ValidUntil DATE/*TIME*/ NOT NULL
)

CREATE TABLE FooBar (
    FooId INT NOT NULL,
    BarId INT NOT NULL,
    ValidFrom DATE/*TIME*/ NOT NULL,
    ValidUntil DATE/*TIME*/ NOT NULL,
    Priority TINYINT NOT NULL,
    SetId nvarchar(5)
)

INSERT INTO Foo(FooId, ValidFrom, ValidUntil)
VALUES 
  (1, '2020-01-01', '2021-12-31')
, (2, '2020-01-01', '2021-06-30')

INSERT INTO FooBar(FooId, BarId, ValidFrom, ValidUntil, Priority, SetId)
VALUES 
-- First set for FooId = 1 with prio 1
  (1, 1, '2021-01-01', '2021-03-01', 1, 'Set 1')
, (1, 2, '2021-01-01', '2021-03-01', 1, 'Set 1')
, (1, 3, '2021-01-01', '2021-03-01', 1, 'Set 1')
-- Second set for FooId = 1 with prio 2 
, (1, 1, '2021-02-01', '2021-06-01', 2, 'Set 2')
, (1, 2, '2021-02-01', '2021-06-01', 2, 'Set 2')
-- Third set for FooId = 1 with prio 3
, (1, 1, '2021-01-01', '2021-12-31', 3, 'Set 3')
, (1, 2, '2021-01-01', '2021-12-31', 3, 'Set 3')
, (1, 3, '2021-01-01', '2021-12-31', 3, 'Set 3')
-- Fourth set for FooId = 1 with Prio 1
, (1, 4, '2021-04-01', '2021-04-02', 1, 'Set 4')
, (1, 5, '2021-04-01', '2021-04-02', 1, 'Set 4')
-- First set for FooId = 2 with prio 3
, (2, 6, '2021-01-01', '2021-04-02', 3, 'Set 1')

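Solution

  1. The common table expressions (CTEs) ValidFrom and ValidPeriod cut all period information from Foo and FooBar into the smallest individual periods.
  2. The previous step also generates an extra, incomplete trailing period for each FooId, which is removed with the exists clause.
  3. Then, for each individual period, the FooBar records with the first priority value are fetched (that is, no similar record with a smaller priority value is allowed: not exists ... fb2.Priority < fb.Priority).

This gives: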

with ValidFrom as
(
  select f.FooId,
         f.ValidFrom
  from Foo f
    union
  select f.FooId,
         dateadd(day, 1, f.ValidUntil)
  from Foo f
    union
  select fb.FooId,
         fb.ValidFrom
  from FooBar fb
    union
  select fb.FooId,
         dateadd(day, 1, fb.ValidUntil)
  from FooBar fb
),
ValidPeriod as
(
  select vf.FooId,
         vf.ValidFrom,
         dateadd(day, -1, lead(vf.ValidFrom) over(partition by vf.FooId order by vf.ValidFrom)) as ValidUntil
  from ValidFrom vf
)
select vp.FooId,
       fb.BarId,
       vp.ValidFrom,
       vp.ValidUntil,
     --fb.ValidFrom,
     --fb.ValidUntil,
     --fb.Priority,
       fb.SetId
from ValidPeriod vp
left join FooBar fb
  on  fb.FooId = vp.FooId
  and fb.ValidFrom <= vp.ValidUntil
  and fb.ValidUntil >= vp.ValidFrom
  and not exists ( select 'x'
                   from FooBar fb2
                   where fb2.FooId = fb.FooId
                     and fb2.BarId = fb.BarId
                     and fb2.ValidFrom <= vp.ValidUntil
                     and fb2.ValidUntil >= vp.ValidFrom
                     and fb2.Priority < fb.Priority )
where exists ( select 'x'
               from ValidPeriod vp2
               where vp2.FooId = vp.FooId
                 and vp2.ValidFrom > vp.ValidFrom )
order by vp.FooId,
         vp.ValidFrom,
         fb.BarId;
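
To see in isolation how the ValidPeriod CTE turns a list of start dates into closed periods, here is a minimal sketch. The three dates are hypothetical boundary values for a single FooId, written inline instead of being read from the tables.

```sql
-- Sketch: each start date is paired with the day before the next start date.
-- The inline dates are made-up boundary values, not rows from Foo or FooBar.
select b.ValidFrom,
       dateadd(day, -1, lead(b.ValidFrom) over(order by b.ValidFrom)) as ValidUntil
from ( values (cast('2021-01-01' as date)),
              (cast('2021-02-01' as date)),
              (cast('2021-03-02' as date)) ) as b(ValidFrom)
order by b.ValidFrom;

-- ValidFrom   ValidUntil
-- 2021-01-01  2021-01-31
-- 2021-02-01  2021-03-01
-- 2021-03-02  null        -- no following start date: the incomplete trailing period
```

The last row has no following start date, so lead() returns null. That row corresponds to the trailing period that the where exists clause in the full query filters out.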

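Result

This result contains more period information than you requested in the expected result. Removing the unions with Foo from the first CTE removes the null values and limits the periods to the period information available in FooBar (effectively this eliminates Foo from the solution altogether).

With vp.ValidFrom and vp.ValidUntil as the result period: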

FooId  BarId  ValidFrom   ValidUntil  SetId
-----  -----  ----------  ----------  -----
1      null   2020-01-01  2020-12-31  null  -- extra row
1      1      2021-01-01  2021-01-31  Set 1
1      2      2021-01-01  2021-01-31  Set 1
1      3      2021-01-01  2021-01-31  Set 1
1      1      2021-02-01  2021-03-01  Set 1 -- extra row
1      2      2021-02-01  2021-03-01  Set 1 -- extra row
1      3      2021-02-01  2021-03-01  Set 1 -- extra row
1      1      2021-03-02  2021-03-31  Set 2
1      2      2021-03-02  2021-03-31  Set 2
1      3      2021-03-02  2021-03-31  Set 3 -- extra row
1      1      2021-04-01  2021-04-02  Set 2 -- extra row
1      2      2021-04-01  2021-04-02  Set 2 -- extra row
1      3      2021-04-01  2021-04-02  Set 3 -- extra row
1      4      2021-04-01  2021-04-02  Set 4
1      5      2021-04-01  2021-04-02  Set 4
1      1      2021-04-03  2021-06-01  Set 2
1      2      2021-04-03  2021-06-01  Set 2
1      3      2021-04-03  2021-06-01  Set 3 -- extra row
1      1      2021-06-02  2021-12-31  Set 3
1      2      2021-06-02  2021-12-31  Set 3
1      3      2021-06-02  2021-12-31  Set 3
2      null   2020-01-01  2020-12-31  null  -- extra row
2      6      2021-01-01  2021-04-02  Set 1
2      null   2021-04-03  2021-06-30  null  -- extra row

With fb.ValidFrom and fb.ValidUntil as the result period:

FooId  BarId  ValidFrom   ValidUntil  SetId
-----  -----  ----------  ----------  -----
1      null   null        null        null  -- extra row
1      1      2021-01-01  2021-03-01  Set 1
1      2      2021-01-01  2021-03-01  Set 1
1      3      2021-01-01  2021-03-01  Set 1
1      1      2021-01-01  2021-03-01  Set 1 -- extra row
1      2      2021-01-01  2021-03-01  Set 1 -- extra row
1      3      2021-01-01  2021-03-01  Set 1 -- extra row
1      1      2021-02-01  2021-06-01  Set 2
1      2      2021-02-01  2021-06-01  Set 2
1      3      2021-01-01  2021-12-31  Set 3 -- extra row
1      1      2021-02-01  2021-06-01  Set 2 -- extra row
1      2      2021-02-01  2021-06-01  Set 2 -- extra row
1      3      2021-01-01  2021-12-31  Set 3 -- extra row
1      4      2021-04-01  2021-04-02  Set 4
1      5      2021-04-01  2021-04-02  Set 4
1      1      2021-02-01  2021-06-01  Set 2
1      2      2021-02-01  2021-06-01  Set 2
1      3      2021-01-01  2021-12-31  Set 3 -- extra row
1      1      2021-01-01  2021-12-31  Set 3
1      2      2021-01-01  2021-12-31  Set 3
1      3      2021-01-01  2021-12-31  Set 3
2      null   null        null        null  -- extra row
2      6      2021-01-01  2021-04-02  Set 1
2      null   null        null        null  -- extra row
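
The variation mentioned at the start of this section (dropping the unions with Foo) could look roughly like the sketch below. Two things in it are my own assumptions rather than part of the query above. It uses an inner join, so a period that no FooBar record covers is dropped. The where exists clause is left out, because the trailing period has a null ValidUntil and can never satisfy the join condition.

```sql
-- Sketch: period boundaries come from FooBar only; Foo is no longer referenced.
with ValidFrom as
(
  select fb.FooId,
         fb.ValidFrom
  from FooBar fb
    union
  select fb.FooId,
         dateadd(day, 1, fb.ValidUntil)
  from FooBar fb
),
ValidPeriod as
(
  select vf.FooId,
         vf.ValidFrom,
         dateadd(day, -1, lead(vf.ValidFrom) over(partition by vf.FooId order by vf.ValidFrom)) as ValidUntil
  from ValidFrom vf
)
select vp.FooId,
       fb.BarId,
       vp.ValidFrom,
       vp.ValidUntil,
       fb.SetId
from ValidPeriod vp
join FooBar fb  -- inner join (assumption): periods without a FooBar record disappear
  on  fb.FooId = vp.FooId
  and fb.ValidFrom <= vp.ValidUntil  -- never true for the trailing period (ValidUntil is null)
  and fb.ValidUntil >= vp.ValidFrom
  and not exists ( select 'x'
                   from FooBar fb2
                   where fb2.FooId = fb.FooId
                     and fb2.BarId = fb.BarId
                     and fb2.ValidFrom <= vp.ValidUntil
                     and fb2.ValidUntil >= vp.ValidFrom
                     and fb2.Priority < fb.Priority )
order by vp.FooId,
         vp.ValidFrom,
         fb.BarId;
```

With the sample data, every remaining period for FooId = 1 is covered by Set 3, so the null rows disappear either way. With other data, gaps between FooBar records could still produce null rows if the left join were kept.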

Fiddle to see things in action.
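
If the priority is meant to apply to a whole set rather than to each BarId separately, which is how I read the expected result in the question, one possible adjustment is to drop the BarId correlation from the not exists clause. This is a sketch under that assumption. I have not checked it against anything beyond the sample data, and sets that overlap with the same priority would all be returned.

```sql
-- Sketch: same CTEs as the query above; only the "not exists" differs.
with ValidFrom as
(
  select f.FooId, f.ValidFrom from Foo f
    union
  select f.FooId, dateadd(day, 1, f.ValidUntil) from Foo f
    union
  select fb.FooId, fb.ValidFrom from FooBar fb
    union
  select fb.FooId, dateadd(day, 1, fb.ValidUntil) from FooBar fb
),
ValidPeriod as
(
  select vf.FooId,
         vf.ValidFrom,
         dateadd(day, -1, lead(vf.ValidFrom) over(partition by vf.FooId order by vf.ValidFrom)) as ValidUntil
  from ValidFrom vf
)
select vp.FooId,
       fb.BarId,
       vp.ValidFrom,
       vp.ValidUntil,
       fb.SetId
from ValidPeriod vp
left join FooBar fb
  on  fb.FooId = vp.FooId
  and fb.ValidFrom <= vp.ValidUntil
  and fb.ValidUntil >= vp.ValidFrom
  -- No "fb2.BarId = fb.BarId" here: any overlapping record of the same FooId
  -- with a smaller Priority value excludes fb, whatever its BarId.
  and not exists ( select 'x'
                   from FooBar fb2
                   where fb2.FooId = fb.FooId
                     and fb2.ValidFrom <= vp.ValidUntil
                     and fb2.ValidUntil >= vp.ValidFrom
                     and fb2.Priority < fb.Priority )
where exists ( select 'x'
               from ValidPeriod vp2
               where vp2.FooId = vp.FooId
                 and vp2.ValidFrom > vp.ValidFrom )
order by vp.FooId,
         vp.ValidFrom,
         fb.BarId;
```

On the sample data this would leave only Set 2 for 2021-03-02 -> 2021-03-31 and only Set 4 for 2021-04-01 -> 2021-04-02. Set 1 would still be split into 2021-01-01 -> 2021-01-31 and 2021-02-01 -> 2021-03-01, because adjacent periods from the same set are not merged back together.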