Recursive query to use a date returned in initial query as limit in subsequent query

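I have a business need to predict when a particular task must be performed, based on usage.
For example, you need to change your oil every 3000 miles. Some days you drive 300 miles, other days you drive 500. When you hit 3000, you change the oil and restart the counter. Given a table of projected daily usage, return the set of all oil change dates.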

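I could do this in a table-valued function or some other 'coded' solution.
But I thought I could do it in a single statement, maybe with a recursive CTE.
I am having trouble 'joining' the next date into the WHERE of the recursive part.
SQL doesn't like 'TOP 1' in a recursive CTE at all. :)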

I would like a set like this:

Here's what I've got:

WITH cte_MilesMX (RateDate,RunningRateMiles)
AS
(
    -- Initial query
    SELECT TOP 1 *
    FROM (
      SELECT 
      RateDate,
      SUM(RateMiles) OVER (ORDER BY RateDate) AS RunningRateMiles
      FROM dbo.RatesbyDay
      WHERE RateDate > '2020-01-01') q1
    WHERE q1.RunningRateMiles >= 3000
    UNION ALL
    -- Recursive part
    SELECT TOP 1 *
    FROM (
      SELECT 
      rbd.RateDate,
      SUM(RateMiles) OVER (ORDER BY rbd.RateDate) AS RunningRateMiles
      FROM dbo.RatesbyDay rbd
      JOIN cte_MilesMX cte
        ON 1 = 1
      WHERE rbd.RateDate > cte.RateDate) q1
    WHERE q1.RunningRateMiles >= 3000
)

SELECT *
FROM   cte_MilesMX

Any help would be appreciated. Thanks.
If you want to play around with this, here is some sample data:

CREATE TABLE RatesbyDay(
    RateDate DATE,
    RateMiles INT);
INSERT INTO RatesbyDay VALUES ('2020-01-01',600)
INSERT INTO RatesbyDay VALUES ('2020-01-02',450)
INSERT INTO RatesbyDay VALUES ('2020-01-03',370)
INSERT INTO RatesbyDay VALUES ('2020-01-04',700)
INSERT INTO RatesbyDay VALUES ('2020-01-05',100)
INSERT INTO RatesbyDay VALUES ('2020-01-06',480)
INSERT INTO RatesbyDay VALUES ('2020-01-07',430)
INSERT INTO RatesbyDay VALUES ('2020-01-08',200)
INSERT INTO RatesbyDay VALUES ('2020-01-09',590)
INSERT INTO RatesbyDay VALUES ('2020-01-10',380)
INSERT INTO RatesbyDay VALUES ('2020-01-11',220)
INSERT INTO RatesbyDay VALUES ('2020-01-12',320)
INSERT INTO RatesbyDay VALUES ('2020-01-13',360)
INSERT INTO RatesbyDay VALUES ('2020-01-14',600)
INSERT INTO RatesbyDay VALUES ('2020-01-15',450)
INSERT INTO RatesbyDay VALUES ('2020-01-16',475)
INSERT INTO RatesbyDay VALUES ('2020-01-17',300)
INSERT INTO RatesbyDay VALUES ('2020-01-18',190)
INSERT INTO RatesbyDay VALUES ('2020-01-19',435)
INSERT INTO RatesbyDay VALUES ('2020-01-20',285)
INSERT INTO RatesbyDay VALUES ('2020-01-21',350)
INSERT INTO RatesbyDay VALUES ('2020-01-22',410)
INSERT INTO RatesbyDay VALUES ('2020-01-23',250)
INSERT INTO RatesbyDay VALUES ('2020-01-24',300)
INSERT INTO RatesbyDay VALUES ('2020-01-25',250)
INSERT INTO RatesbyDay VALUES ('2020-01-26',650)
INSERT INTO RatesbyDay VALUES ('2020-01-27',180)
INSERT INTO RatesbyDay VALUES ('2020-01-28',280)
INSERT INTO RatesbyDay VALUES ('2020-01-29',200)
INSERT INTO RatesbyDay VALUES ('2020-01-30',100)
INSERT INTO RatesbyDay VALUES ('2020-01-31',100)

-- this returns the 1st oil change assuming we just changed it on 1-1-2020
SELECT TOP 1 *
FROM (
  SELECT 
    RateDate,
    SUM(RateMiles) OVER (ORDER BY RateDate) AS RunningRateMiles
FROM dbo.RatesbyDay
WHERE RateDate > '2020-01-01') q1
WHERE q1.RunningRateMiles >= 3000

-- the above query returned 1-9-2020 as the oil change, so when is the next one.
SELECT TOP 1 *
FROM (
  SELECT 
    RateDate,
    SUM(RateMiles) OVER (ORDER BY RateDate) AS RunningRateMiles
FROM dbo.RatesbyDay
WHERE RateDate > '2020-01-09') q1
WHERE q1.RunningRateMiles >= 3000

-- etc. etc.
SELECT TOP 1 *
FROM (
  SELECT 
    RateDate,
    SUM(RateMiles) OVER (ORDER BY RateDate) AS RunningRateMiles
FROM dbo.RatesbyDay
WHERE RateDate > '2020-01-17') q1
WHERE q1.RunningRateMiles >= 3000

SELECT TOP 1 *
FROM (
  SELECT 
    RateDate,
    SUM(RateMiles) OVER (ORDER BY RateDate) AS RunningRateMiles
FROM dbo.RatesbyDay
WHERE RateDate > '2020-01-26') q1
WHERE q1.RunningRateMiles >= 3000
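
For reference, the manual process in the queries above (find the first date where the running total reaches 3000, then rerun with that date as the new lower bound) can be sketched procedurally. This is Python rather than T-SQL, it assumes the rows are already in date order, and next_change / all_changes are illustrative names only:

```python
from datetime import date, timedelta

# Sample data from the question: RateMiles for 2020-01-01 .. 2020-01-31.
MILES = [600, 450, 370, 700, 100, 480, 430, 200, 590, 380, 220, 320, 360,
         600, 450, 475, 300, 190, 435, 285, 350, 410, 250, 300, 250, 650,
         180, 280, 200, 100, 100]
RATES = [(date(2020, 1, 1) + timedelta(days=i), m) for i, m in enumerate(MILES)]


def next_change(rates, after, threshold=3000):
    """Return the first date later than `after` where the running total reaches the threshold."""
    running = 0
    for day, miles in rates:
        if day <= after:
            continue  # same role as WHERE RateDate > @after
        running += miles
        if running >= threshold:
            return day
    return None  # threshold not reached in the remaining rows


def all_changes(rates, start, threshold=3000):
    """Repeat the search, feeding each result back in as the next lower bound."""
    changes = []
    anchor = next_change(rates, start, threshold)
    while anchor is not None:
        changes.append(anchor)
        anchor = next_change(rates, anchor, threshold)
    return changes


for change_date in all_changes(RATES, date(2020, 1, 1)):
    print(change_date)  # 2020-01-09, 2020-01-17, 2020-01-26
```

These are the same dates used as lower bounds in the hand-run queries above.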

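This isn't a recursive CTE, but it does do what you're trying to do. The technique goes by a few different names, usually "Quirky Update" or "Ordered Update".

First, note that I added two new columns and a clustered index to your table. They are, in fact, necessary, but if you are unwilling or unable to modify the existing table, this works just as well with a #TempTable.

For more detail, see Solving the Running Total and Ordinal Rank Problems (Rewritten).

Also, fair warning: Microsoft does not guarantee that this will work as expected, so the technique is not without its drawbacks.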

USE tempdb;
GO

IF OBJECT_ID('tempdb.dbo.RatesByDay', 'U') IS NOT NULL 
BEGIN DROP TABLE tempdb.dbo.RatesByDay; END;
GO

CREATE TABLE tempdb.dbo.RatesByDay (
    RateDate date NOT NULL
        CONSTRAINT pk_RatesByDay PRIMARY KEY CLUSTERED (RateDate), -- clustered index is needed to control the direction of the update.
    RateMiles int NOT NULL,
    IsChangeDay bit NULL,
    MilesSinceLastChange int NULL
    );
GO

INSERT tempdb.dbo.RatesByDay (RateDate, RateMiles) VALUES
    ('2020-01-01',600),('2020-01-02',450),('2020-01-03',370),('2020-01-04',700),('2020-01-05',100),('2020-01-06',480),
    ('2020-01-07',430),('2020-01-08',200),('2020-01-09',590),('2020-01-10',380),('2020-01-11',220),('2020-01-12',320),
    ('2020-01-13',360),('2020-01-14',600),('2020-01-15',450),('2020-01-16',475),('2020-01-17',300),('2020-01-18',190),
    ('2020-01-19',435),('2020-01-20',285),('2020-01-21',350),('2020-01-22',410),('2020-01-23',250),('2020-01-24',300),
    ('2020-01-25',250),('2020-01-26',650),('2020-01-27',180),('2020-01-28',280),('2020-01-29',200),('2020-01-30',100),
    ('2020-01-31',100);

--=====================================================================================================================

DECLARE 
    @RunningMiles int = 0,
    @Anchor date;

UPDATE rbd SET          
    @RunningMiles = rbd.MilesSinceLastChange = CASE WHEN @RunningMiles < 3000 THEN @RunningMiles ELSE 0 END + rbd.RateMiles,
    rbd.IsChangeDay = CASE WHEN @RunningMiles < 3000 THEN 0 ELSE 1 END,
    @Anchor = rbd.RateDate
FROM
    dbo.RatesByDay rbd WITH (TABLOCKX, INDEX (1))
WHERE 1 = 1
    AND rbd.RateDate > '2020-01-01'
OPTION (MAXDOP 1);

-------------------------------------

SELECT * FROM dbo.RatesByDay rbd;

Results...

RateDate   RateMiles   IsChangeDay MilesSinceLastChange
---------- ----------- ----------- --------------------
2020-01-01 600         NULL        NULL
2020-01-02 450         0           450
2020-01-03 370         0           820
2020-01-04 700         0           1520
2020-01-05 100         0           1620
2020-01-06 480         0           2100
2020-01-07 430         0           2530
2020-01-08 200         0           2730
2020-01-09 590         1           3320
2020-01-10 380         0           380
2020-01-11 220         0           600
2020-01-12 320         0           920
2020-01-13 360         0           1280
2020-01-14 600         0           1880
2020-01-15 450         0           2330
2020-01-16 475         0           2805
2020-01-17 300         1           3105
2020-01-18 190         0           190
2020-01-19 435         0           625
2020-01-20 285         0           910
2020-01-21 350         0           1260
2020-01-22 410         0           1670
2020-01-23 250         0           1920
2020-01-24 300         0           2220
2020-01-25 250         0           2470
2020-01-26 650         1           3120
2020-01-27 180         0           180
2020-01-28 280         0           460
2020-01-29 200         0           660
2020-01-30 100         0           760
2020-01-31 100         0           860
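
The clustered index and the hints in the UPDATE are there because the outcome depends on the rows being visited in RateDate order. A rough Python sketch of the same carry-forward logic (not T-SQL; ordered_update and change_days are made-up names) shows how sensitive the result is to visiting order:

```python
# Sample data from the question: RateMiles indexed by day of January 2020.
MILES = [600, 450, 370, 700, 100, 480, 430, 200, 590, 380, 220, 320, 360,
         600, 450, 475, 300, 190, 435, 285, 350, 410, 250, 300, 250, 650,
         180, 280, 200, 100, 100]
# Same filter as WHERE RateDate > '2020-01-01'.
ROWS = [(day, miles) for day, miles in enumerate(MILES, start=1) if day > 1]


def ordered_update(rows, threshold=3000):
    """Return {day: (is_change_day, miles_since_last_change)} in visiting order."""
    running = 0  # plays the role of @RunningMiles
    result = {}
    for day, miles in rows:
        # Carry the count forward unless the previous row already hit the threshold.
        running = (running if running < threshold else 0) + miles
        result[day] = (running >= threshold, running)
    return result


def change_days(result):
    """List the days flagged as change days, in the order they were visited."""
    return [day for day, (is_change, _) in result.items() if is_change]


print(change_days(ordered_update(ROWS)))        # [9, 17, 26]
print(change_days(ordered_update(ROWS[::-1])))  # [21, 13, 4]
```

Visiting the same rows in reverse flags completely different days, which is why the technique is only as reliable as the guarantee that the update runs in clustered-index order.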

You can do this with a recursive query:

with 
    data as (select r.*, row_number() over(order by ratedate) rn from ratesbyday r),
    cte as (
        select d.*, ratemiles total, ratemiles newtotal from data d where rn = 1
        union all
        select d.*, 
            c.newtotal + d.ratemiles,
            case when c.newtotal < 3000 and c.newtotal + d.ratemiles >= 3000 then 0 else c.newtotal + d.ratemiles end
        from cte c
        inner join data d on d.rn = c.rn + 1
    )
select ratedate, ratemiles, total 
from cte 
where newtotal = 0
order by ratedate

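The query starts by enumerating the rows. It then walks through them iteratively, starting from the first; each time the 3000-mile threshold is crossed, the running mile count is reset. We can then filter on the "reset" rows.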

Demo on DB Fiddle:

ratedate   | ratemiles | total
:--------- | --------: | ----:
2020-01-07 |       430 |  3130
2020-01-15 |       450 |  3120
2020-01-25 |       250 |  3245

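If your dataset may contain more than 100 rows, you need to add option (maxrecursion 0) at the very end of the query, because SQL Server stops a recursive query after 100 levels by default.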
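
To make the anchor and recursive members easier to follow, here is a small Python trace of the same row-by-row walk. It is only a sketch: walk is a hypothetical name, and the tuples it yields stand in for the rows of cte in the query above.

```python
from datetime import date, timedelta

# Sample data from the question: RateMiles for 2020-01-01 .. 2020-01-31.
MILES = [600, 450, 370, 700, 100, 480, 430, 200, 590, 380, 220, 320, 360,
         600, 450, 475, 300, 190, 435, 285, 350, 410, 250, 300, 250, 650,
         180, 280, 200, 100, 100]
RATES = [(date(2020, 1, 1) + timedelta(days=i), m) for i, m in enumerate(MILES)]


def walk(rows, threshold=3000):
    """Yield (rn, day, miles, total, newtotal), one tuple per row of the cte."""
    # Anchor member: the first row starts the count.
    day, miles = rows[0]
    total = newtotal = miles
    yield 1, day, miles, total, newtotal
    # Recursive member: row rn + 1 is joined to the previous row's newtotal.
    for rn in range(2, len(rows) + 1):
        day, miles = rows[rn - 1]
        total = newtotal + miles
        # Reset to 0 when this row crosses the threshold, otherwise keep counting.
        newtotal = 0 if newtotal < threshold <= total else total
        yield rn, day, miles, total, newtotal


def resets(rows):
    """Keep only the rows where the count was reset (WHERE newtotal = 0)."""
    return [(d, m, t) for _, d, m, t, nt in walk(rows) if nt == 0]


for row in resets(RATES):
    print(row)  # 2020-01-07 / 3130, 2020-01-15 / 3120, 2020-01-25 / 3245
```

Because the walk starts counting on 2020-01-01, these dates differ from the ones in the question, which counts only the days after 2020-01-01. Calling resets(RATES[1:]) gives 2020-01-09, 2020-01-17 and 2020-01-26; in the query, that would presumably correspond to filtering the data CTE on ratedate > '2020-01-01'.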

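In this case I would use a rolling aggregate and then use the mod operator to find the points where it crosses each 3000-mile interval.

Here is an example using the table definition and inserts above (loaded into a temp table, #RatesByDay):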

-- When the mod value "resets", the oil change is due; detect this by comparing with the previous row via LAG
SELECT
    agg.RateDate
    ,agg.RateMiles
    ,agg.MilesAgg
    ,agg.MilesAgg % 3000 AS ModValue
    ,CASE WHEN agg.MilesAgg % 3000 < LAG(agg.MilesAgg) OVER (ORDER BY agg.RateDate) % 3000
        THEN 'Due'
        ELSE 'NotDue'
    END AS IsDue
FROM
(
    -- Get the rolling total of miles
    SELECT
        rbd.RateDate
        ,rbd.RateMiles
        ,SUM(rbd.RateMiles) OVER (ORDER BY rbd.RateDate ROWS UNBOUNDED PRECEDING) AS MilesAgg
    FROM #RatesByDay rbd
) agg

Results, with the first day's 600 miles counted as driven after an oil change:

RateDate    Mi  MiAgg   Mod     IsDue?
--------------------------------------
2020-01-01  600 600     600     NotDue
2020-01-02  450 1050    1050    NotDue
2020-01-03  370 1420    1420    NotDue
2020-01-04  700 2120    2120    NotDue
2020-01-05  100 2220    2220    NotDue
2020-01-06  480 2700    2700    NotDue
2020-01-07  430 3130    130     Due
2020-01-08  200 3330    330     NotDue
2020-01-09  590 3920    920     NotDue
2020-01-10  380 4300    1300    NotDue
2020-01-11  220 4520    1520    NotDue
2020-01-12  320 4840    1840    NotDue
2020-01-13  360 5200    2200    NotDue
2020-01-14  600 5800    2800    NotDue
2020-01-15  450 6250    250     Due
2020-01-16  475 6725    725     NotDue
2020-01-17  300 7025    1025    NotDue
2020-01-18  190 7215    1215    NotDue
2020-01-19  435 7650    1650    NotDue
2020-01-20  285 7935    1935    NotDue
2020-01-21  350 8285    2285    NotDue
2020-01-22  410 8695    2695    NotDue
2020-01-23  250 8945    2945    NotDue
2020-01-24  300 9245    245     Due
2020-01-25  250 9495    495     NotDue
2020-01-26  650 10145   1145    NotDue
2020-01-27  180 10325   1325    NotDue
2020-01-28  280 10605   1605    NotDue
2020-01-29  200 10805   1805    NotDue
2020-01-30  100 10905   1905    NotDue
2020-01-31  100 11005   2005    NotDue
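
One thing worth checking before adopting the modulo test: it fires whenever the cumulative total crosses a multiple of 3000, so miles driven past the threshold carry over into the next interval instead of being discarded at the oil change. A quick Python comparison of the two rules on the sample mileage (a sketch only, not T-SQL; both function names are made up):

```python
# Sample data from the question: RateMiles for days 1..31 of January 2020.
MILES = [600, 450, 370, 700, 100, 480, 430, 200, 590, 380, 220, 320, 360,
         600, 450, 475, 300, 190, 435, 285, 350, 410, 250, 300, 250, 650,
         180, 280, 200, 100, 100]


def due_by_reset(miles, threshold=3000):
    """Days on which the count reaches the threshold, restarting from 0 afterwards."""
    due, running = [], 0
    for day, m in enumerate(miles, start=1):
        running += m
        if running >= threshold:
            due.append(day)
            running = 0  # the counter restarts at the oil change
    return due


def due_by_modulo(miles, threshold=3000):
    """Days on which the cumulative total's remainder drops below the previous day's."""
    due, total = [], 0
    for day, m in enumerate(miles, start=1):
        previous = total
        total += m
        # Never true on day 1 (previous is 0), matching the NULL from LAG in SQL.
        if total % threshold < previous % threshold:
            due.append(day)
    return due


print(due_by_reset(MILES))   # [7, 15, 25]
print(due_by_modulo(MILES))  # [7, 15, 24]
```

The two rules agree on the first two dates but drift apart on the third (the 25th versus the 24th), so which one is right depends on whether the counter really restarts at zero after each change.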