How do I select the last non-empty value for a given ID and date?

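I'm trying to select the last value in a column that isn't blank (blank, not technically NULL) and carry it forward to every later date until the value changes, then carry that new value forward, and so on.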

What I have:

company_id  date                     sales_stage  previous_sales_stage
1           2022-05-20 00:00:00.000  a            NULL
1           2022-05-19 00:00:00.000  b            NULL
1           2022-05-18 00:00:00.000  c            NULL
1           2022-05-17 00:00:00.000  c            NULL
1           2022-05-16 00:00:00.000  c            NULL
1           2022-05-15 00:00:00.000  d            NULL
1           2022-05-14 00:00:00.000  d            NULL
1           2022-05-13 00:00:00.000  d            NULL
1           2022-05-12 00:00:00.000  e            NULL
1           2022-05-11 00:00:00.000  e            NULL

What I want:

company_id  date                     sales_stage  previous_sales_stage
1           2022-05-20 00:00:00.000  a            b
1           2022-05-19 00:00:00.000  b            c
1           2022-05-18 00:00:00.000  c            d
1           2022-05-17 00:00:00.000  c            d
1           2022-05-16 00:00:00.000  c            d
1           2022-05-15 00:00:00.000  d            e
1           2022-05-14 00:00:00.000  d            e
1           2022-05-13 00:00:00.000  d            e
1           2022-05-12 00:00:00.000  e            NULL
1           2022-05-11 00:00:00.000  e            NULL

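This is a summary table shared by all companies (so any given date has multiple company IDs and stages), and it is recalculated every day by a stored procedure. NULL is fine where there is no previous value.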

Here is some T-SQL that creates a temp table to reproduce the example:

DROP TABLE IF EXISTS #blah;

CREATE TABLE #blah
(
    company_id INT
  , [date] DATETIME
  , sales_stage VARCHAR(50)
  , previous_sales_stage VARCHAR(50)
);


-- Use a fixed anchor date so the rows match the tables above and the UPDATE statements below;
-- GETDATE() would only line up when the script is run on 2022-05-20.
DECLARE @anchor DATE = '2022-05-20';

INSERT INTO #blah (company_id, sales_stage, [date]) VALUES
    (1, 'a', @anchor)
  , (1, 'b', DATEADD(DAY, -1, @anchor))
  , (1, 'c', DATEADD(DAY, -2, @anchor))
  , (1, 'c', DATEADD(DAY, -3, @anchor))
  , (1, 'c', DATEADD(DAY, -4, @anchor))
  , (1, 'd', DATEADD(DAY, -5, @anchor))
  , (1, 'd', DATEADD(DAY, -6, @anchor))
  , (1, 'd', DATEADD(DAY, -7, @anchor))
  , (1, 'e', DATEADD(DAY, -8, @anchor))
  , (1, 'e', DATEADD(DAY, -9, @anchor));


SELECT * FROM #blah

UPDATE #blah SET previous_sales_stage = 'b' WHERE company_id = 1 AND date = '2022-05-20 00:00:00.000'
UPDATE #blah SET previous_sales_stage = 'c' WHERE company_id = 1 AND date = '2022-05-19 00:00:00.000'
UPDATE #blah SET previous_sales_stage = 'd' WHERE company_id = 1 AND date = '2022-05-18 00:00:00.000'
UPDATE #blah SET previous_sales_stage = 'd' WHERE company_id = 1 AND date = '2022-05-17 00:00:00.000'
UPDATE #blah SET previous_sales_stage = 'd' WHERE company_id = 1 AND date = '2022-05-16 00:00:00.000'
UPDATE #blah SET previous_sales_stage = 'e' WHERE company_id = 1 AND date = '2022-05-15 00:00:00.000'
UPDATE #blah SET previous_sales_stage = 'e' WHERE company_id = 1 AND date = '2022-05-14 00:00:00.000'
UPDATE #blah SET previous_sales_stage = 'e' WHERE company_id = 1 AND date = '2022-05-13 00:00:00.000'
UPDATE #blah SET previous_sales_stage = NULL WHERE company_id = 1 AND date = '2022-05-12 00:00:00.000'
UPDATE #blah SET previous_sales_stage = NULL WHERE company_id = 1 AND date = '2022-05-11 00:00:00.000'

SELECT * FROM #blah

You can use a subquery or OUTER APPLY, like this:

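SELECT t1.company_id, t1.date, t1.sales_stage, x.previous_sales_stage
FROM #blah t1
-- OUTER APPLY keeps rows that have no earlier, different stage; they get NULL
OUTER APPLY (
    -- most recent earlier row for the same company whose stage differs from the current one
    SELECT TOP 1 t2.sales_stage AS previous_sales_stage
    FROM #blah t2
    WHERE t2.company_id = t1.company_id
        AND t2.date < t1.date
        AND t2.sales_stage <> t1.sales_stage
    ORDER BY t2.date DESC
) x
ORDER BY t1.company_id, t1.date DESC;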
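
The subquery form mentioned above isn't shown, so here is a minimal sketch of it against the same #blah temp table, followed by an UPDATE that stores the result in previous_sales_stage. Two things are assumptions rather than anything stated in the question: the extra `<> ''` predicate (added because the question talks about blank rather than NULL values, although the sample data contains no blank stages), and the idea that the daily stored procedure would want the value written back to the table.

```sql
-- Correlated scalar subquery: same lookup as the OUTER APPLY version.
SELECT t1.company_id
     , t1.[date]
     , t1.sales_stage
     , ( SELECT TOP (1) t2.sales_stage
         FROM #blah AS t2
         WHERE t2.company_id = t1.company_id
           AND t2.[date] < t1.[date]
           AND t2.sales_stage <> t1.sales_stage  -- also excludes rows where t2.sales_stage is NULL
           AND t2.sales_stage <> ''              -- assumption: blank stages should be skipped
         ORDER BY t2.[date] DESC ) AS previous_sales_stage
FROM #blah AS t1
ORDER BY t1.company_id, t1.[date] DESC;

-- Assumption: the daily job stores the value in the existing column.
UPDATE t1
SET previous_sales_stage =
    ( SELECT TOP (1) t2.sales_stage
      FROM #blah AS t2
      WHERE t2.company_id = t1.company_id
        AND t2.[date] < t1.[date]
        AND t2.sales_stage <> t1.sales_stage
        AND t2.sales_stage <> ''
      ORDER BY t2.[date] DESC )
FROM #blah AS t1;
```

With the sample rows, the 2022-05-18 row (stage c) picks up d from 2022-05-15, and the two e rows stay NULL because no earlier row has a different stage. On a large summary table, an index on (company_id, [date]) would likely help either form, though that depends on the real table's size and existing indexes.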