LAG functions and NULLS

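How can I tell the LAG function to fetch the last "not null" value?

For example, see the table below, where I have some NULL values in columns B and C. I want to fill the NULLs with the last non-null value. I tried to do that with the LAG function, like this: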

case when B is null then lag (B) over (order by idx) else B end as B,

But this doesn't quite work when I have two or more NULLs in a row (see the NULL value in row 3 of column C - I want it to be 0.50, like the original value).
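
To make the failure concrete, here is a small reproduction in Python using the standard-library sqlite3 module (window functions need SQLite 3.25 or newer). The table name `samples` and the three rows are made up for illustration and are not the original table; only the CASE ... LAG expression is taken from the attempt above.

```python
import sqlite3

def lag_fill(rows):
    """Apply the single-step LAG fill to (idx, C) rows and return the resulting C values."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table, standing in for the one in the question.
    conn.execute("CREATE TABLE samples (idx INTEGER PRIMARY KEY, C REAL)")
    conn.executemany("INSERT INTO samples (idx, C) VALUES (?, ?)", rows)
    query = """
        SELECT CASE WHEN C IS NULL
                    THEN LAG(C) OVER (ORDER BY idx)
                    ELSE C
               END AS C
        FROM samples
        ORDER BY idx
    """
    filled = [row[0] for row in conn.execute(query)]
    conn.close()
    return filled

# One value followed by two consecutive NULLs (illustrative data only).
result = lag_fill([(1, 0.50), (2, None), (3, None)])
print(result)
```

LAG only looks one row back, so the second NULL sees the (still NULL) original value of the row before it and stays NULL.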

Any idea how to achieve this? (It doesn't have to use the LAG function; other ideas are welcome.)

Some assumptions:

Thanks

You could change the ORDER BY to force the NULLs to come first in your ordering, but that may be expensive...

lag(B) over (order by CASE WHEN B IS NULL THEN -1 ELSE idx END)

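Alternatively, use a subquery to compute the replacement value once. That may be cheaper on large sets, but it is quite clunky.
- It relies on all the NULLs coming at the end
- The LAG approach does not rely on that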

COALESCE(
    B,
    (
        SELECT
            sorted_not_null.B
        FROM
        (
            SELECT
                table.B,
                ROW_NUMBER() OVER (ORDER BY table.idx DESC)   AS row_id
            FROM
                table
            WHERE
                table.B IS NOT NULL
        )
           sorted_not_null
        WHERE
           sorted_not_null.row_id = 1
    )
)

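(This should be faster on larger data sets than LAG, or than OUTER APPLY with correlated subqueries, because the value is computed only once. To tidy it up, you could compute and store the [last_known_value] for each column in a variable, and then just use COALESCE(A, @last_known_A), COALESCE(B, @last_known_B), etc.)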

You can use the outer apply operator:

select t.id,
       t1.colA,
       t2.colB,
       t3.colC 
from table t
outer apply(select top 1 colA from table where id <= t.id and colA is not null order by id desc) t1
outer apply(select top 1 colB from table where id <= t.id and colB is not null order by id desc) t2
outer apply(select top 1 colC from table where id <= t.id and colC is not null order by id desc) t3;

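This works regardless of how many NULLs there are or how many NULL "islands" there are. You may have values, then NULLs, then values again, then NULLs again. It will still work.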
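
For readers who want to try the "islands" behaviour without SQL Server, here is a rough sketch in Python with the standard-library sqlite3 module. SQLite has no OUTER APPLY, so this swaps in a correlated scalar subquery with LIMIT 1, which plays the same role as the TOP 1 subquery above. The table name `demo`, the single column and the sample rows are assumptions made for illustration.

```python
import sqlite3

def fill_forward(rows):
    """Fill each NULL in colA with the closest earlier non-NULL value, ordered by id."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table with a single nullable column.
    conn.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, colA REAL)")
    conn.executemany("INSERT INTO demo (id, colA) VALUES (?, ?)", rows)
    query = """
        SELECT t.id,
               (SELECT s.colA
                FROM demo AS s
                WHERE s.id <= t.id AND s.colA IS NOT NULL
                ORDER BY s.id DESC
                LIMIT 1) AS colA
        FROM demo AS t
        ORDER BY t.id
    """
    filled = conn.execute(query).fetchall()
    conn.close()
    return filled

# Values, then NULLs, then a value, then a NULL again (illustrative data only).
for row in fill_forward([(1, 1.0), (2, None), (3, None), (4, 2.0), (5, None)]):
    print(row)
```

A NULL that has no earlier non-NULL value stays NULL, because the subquery returns no row for it.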


However, if the assumption (in your question) holds:

Once I have a NULL, is NULL all up to the end - so I want to fill it with the latest value.

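then there is a more efficient solution. We only need to find the latest value (when ordered by idx). Modify the query above by removing the id <= t.id condition from the subqueries: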

select t.id,
       colA = coalesce(t.colA, t1.colA),
       colB = coalesce(t.colB, t2.colB),
       colC = coalesce(t.colC, t3.colC) 
from table t
outer apply (select top 1 colA from table 
             where colA is not null order by id desc) t1
outer apply (select top 1 colB from table 
             where colB is not null order by id desc) t2
outer apply (select top 1 colC from table 
             where colC is not null order by id desc) t3;
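
-- MySQL user-variable syntax. Every row must be visited in idx order
-- so that @n picks up each non-NULL value before the NULLs after it.
SET @n := NULL;
UPDATE table
SET B = (@n := COALESCE(B, @n))
ORDER BY idx;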

If it is NULL all the way to the end, then you can take a shortcut:

declare @b varchar(20) = (select top 1 b from table where b is not null order by id desc);
declare @c varchar(20) = (select top 1 c from table where c is not null order by id desc); 
select id, isnull(b,@b) as b, isnull(c,@c) as c
from table;
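
The shortcut above can be sketched outside SQL Server as well. The following is a minimal Python version using the standard-library sqlite3 module; the table name `demo`, the single column `b` and the sample rows are assumptions for illustration, and COALESCE stands in for ISNULL.

```python
import sqlite3

def fill_trailing_nulls(rows):
    """Look up the last non-NULL b once, then use it for every NULL b."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table, standing in for the one in the question.
    conn.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, b TEXT)")
    conn.executemany("INSERT INTO demo (id, b) VALUES (?, ?)", rows)
    # Computed a single time, like the @b variable in the T-SQL version.
    found = conn.execute(
        "SELECT b FROM demo WHERE b IS NOT NULL ORDER BY id DESC LIMIT 1"
    ).fetchone()
    last_b = found[0] if found else None
    filled = conn.execute(
        "SELECT id, COALESCE(b, ?) AS b FROM demo ORDER BY id", (last_b,)
    ).fetchall()
    conn.close()
    return filled

# Values first, then NULLs up to the end (illustrative data only).
for row in fill_trailing_nulls([(1, "x"), (2, "y"), (3, None), (4, None)]):
    print(row)
```

This is only correct when all the NULLs come at the end. A NULL that sits between two values would receive the later value, not the earlier one.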

select max(diff)
from (
    select case when lag(a) over (order by b) is not null
                then a - lag(a) over (order by b)
           end as diff
    from <tbl_name>
    where <relevant conditions>
    order by b
) k

This works fine in a database visualizer tool.