SQL Conditional Look Ahead

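I want to write a query that identifies the "next" value in an ordered set that satisfies a condition. The LEAD/LAG analytic functions do not seem to apply here, because the number of rows to look ahead is variable (not fixed) and depends on the condition. The example below produces the expected result (the gnme column) for a sample table (tbl), but the solution does not seem ideal. I am hoping someone here can offer a more elegant solution to this type of problem. Thanks in advance.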

Notice in this example how rows 1-3 pick up the nme mike from row 4, and how rows 6-7 pick up the nme michael from row 8.

create table tbl (
  id number
  ,nme varchar(255)
)
;

insert into tbl (id, nme) values (1,'unknown');
insert into tbl (id, nme) values (2,'unknown');
insert into tbl (id, nme) values (3,'unknown');
insert into tbl (id, nme) values (4,'mike');
insert into tbl (id, nme) values (5,'mike');
insert into tbl (id, nme) values (6,'unknown');
insert into tbl (id, nme) values (7,'unknown');
insert into tbl (id, nme) values (8,'michael');
insert into tbl (id, nme) values (9,'michael');
insert into tbl (id, nme) values (10,'michael');
insert into tbl (id, nme) values (11,'unknown');
 
select
  id
  ,nme
  ,CASE WHEN nme = 'unknown' THEN 
          NVL
          (
           (SELECT b.nme 
            FROM tbl b 
            WHERE 
              b.nme <> 'unknown'
              AND a.id < b.id 
            ORDER BY id 
            OFFSET 0 ROWS FETCH NEXT 1 ROW ONLY
           )
           , nme
          ) 
        ELSE nme 
        END AS gnme
FROM
  tbl a
;

+----+---------+---------+
| id | nme     | gnme    |
+----+---------+---------+
| 1  | unknown | mike    |
+----+---------+---------+
| 2  | unknown | mike    |
+----+---------+---------+
| 3  | unknown | mike    |
+----+---------+---------+
| 4  | mike    | mike    |
+----+---------+---------+
| 5  | mike    | mike    |
+----+---------+---------+
| 6  | unknown | michael |
+----+---------+---------+
| 7  | unknown | michael |
+----+---------+---------+
| 8  | michael | michael |
+----+---------+---------+
| 9  | michael | michael |
+----+---------+---------+
| 10 | michael | michael |
+----+---------+---------+
| 11 | unknown | unknown |
+----+---------+---------+

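When the name is unknown, you want the next non-unknown name.

Oracle is one of the few databases that support the ignore nulls option for the window functions lead() and lag(). This is a powerful feature that fits your use case well: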

select 
   id,
   nme,
   case when nme = 'unknown'
       then lead(nullif(nme,'unknown') ignore nulls, 1, 'unknown') over(order by id)
       else nme
   end gnme
from tbl;

The nullif() expression inside lead() turns the value 'unknown' into null; the function then returns the next non-null value (defaulting to 'unknown' if none is available).
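To see why the outer case is still needed, it can help to look at the intermediate values. The following is only a sketch against the tbl from the question; the aliases nme_or_null and next_known are made up for illustration.

```sql
-- Illustration only: expose the intermediate values used by the query above.
select
   id,
   nme,
   -- 'unknown' becomes NULL; every other name passes through unchanged
   nullif(nme, 'unknown') as nme_or_null,
   -- first non-NULL name found in a strictly later row
   lead(nullif(nme, 'unknown') ignore nulls, 1, 'unknown')
      over (order by id) as next_known
from tbl
order by id;
```

lead() only considers rows after the current one, so for id 5 (nme = 'mike') next_known would be 'michael' rather than 'mike'. Wrapping the call in case when nme = 'unknown' keeps a row's own name whenever it is already known.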

You can use the first_value analytic function:

select
  id
  ,nme
  ,nvl(
        first_value(nullif(nme,'unknown') ignore nulls)over(order by id ROWS between current row and unbounded following) 
        ,'unknown')
        AS gnme
FROM
  tbl a
;
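The explicit window frame is what makes this query look forward. When an analytic function has an order by but no frame, the default frame runs from the start of the partition to the current row, so first_value would look backward instead. A small sketch contrasting the two, assuming the same tbl (the aliases first_known_so_far and next_known are hypothetical):

```sql
select
  id
  ,nme
  -- default frame (unbounded preceding .. current row): looks backward
  ,first_value(nullif(nme,'unknown') ignore nulls)
     over (order by id) as first_known_so_far
  -- explicit frame (current row .. unbounded following): looks forward
  ,first_value(nullif(nme,'unknown') ignore nulls)
     over (order by id rows between current row and unbounded following) as next_known
FROM
  tbl
ORDER BY id
;
```

On the sample data, first_known_so_far is NULL for ids 1-3 and 'mike' from id 4 onward, while next_known is 'mike' for ids 1-5, 'michael' for ids 6-10, and NULL for id 11. The nvl in the answer turns that final NULL into 'unknown'.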

Full example for comparison:

select
  id
  ,nme
  ,CASE WHEN nme = 'unknown' THEN 
          NVL
          (
           (SELECT b.nme 
            FROM tbl b 
            WHERE 
              b.nme <> 'unknown'
              AND a.id < b.id 
            ORDER BY id 
            OFFSET 0 ROWS FETCH NEXT 1 ROW ONLY
           )
           , nme
          ) 
        ELSE nme 
        END AS gnme
    ,nvl(
        first_value(nullif(nme,'unknown') ignore nulls)over(order by id ROWS between current row and unbounded following) 
        ,'unknown')
        AS gnme_2
FROM
  tbl a
;

Result:

        ID NME        GNME       GNME_2
---------- ---------- ---------- ----------
         1 unknown    mike       mike
         2 unknown    mike       mike
         3 unknown    mike       mike
         4 mike       mike       mike
         5 mike       mike       mike
         6 unknown    michael    michael
         7 unknown    michael    michael
         8 michael    michael    michael
         9 michael    michael    michael
        10 michael    michael    michael
        11 unknown    unknown    unknown

11 rows selected.

You can also use just LAST_VALUE() with IGNORE NULLS:

WITH
-- your input (Oracle releases before 23c require FROM dual on each SELECT below)
tbl(id,nme) AS (
          SELECT 1,'unknown'
UNION ALL SELECT 2,'unknown'
UNION ALL SELECT 3,'unknown'
UNION ALL SELECT 4,'mike'
UNION ALL SELECT 5,'mike'
UNION ALL SELECT 6,'unknown'
UNION ALL SELECT 7,'unknown'
UNION ALL SELECT 8,'michael'
UNION ALL SELECT 9,'michael'
UNION ALL SELECT 10,'michael'
UNION ALL SELECT 11,'unknown'
)
SELECT
  *
, NVL(
    LAST_VALUE(NULLIF(nme,'unknown') IGNORE NULLS) OVER(
      ORDER BY id DESC
    )
  , 'unknown'
  ) AS gnme
FROM tbl
ORDER BY id;
-- out  id |   nme   |  gnme   
-- out ----+---------+---------
-- out   1 | unknown | mike
-- out   2 | unknown | mike
-- out   3 | unknown | mike
-- out   4 | mike    | mike
-- out   5 | mike    | mike
-- out   6 | unknown | michael
-- out   7 | unknown | michael
-- out   8 | michael | michael
-- out   9 | michael | michael
-- out  10 | michael | michael
-- out  11 | unknown | unknown
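This works without a frame clause because the ordering is reversed. With ORDER BY id DESC, the default frame (start of the partition to the current row) covers the current row plus every row with a higher id, and the last non-null value in that descending order is the nearest known name looking forward. A sketch that spells the frame out, assuming id is unique so that ROWS and the default RANGE frame select the same rows:

```sql
SELECT
  id
, nme
, NVL(
    LAST_VALUE(NULLIF(nme,'unknown') IGNORE NULLS) OVER(
      ORDER BY id DESC
      -- same rows as the implicit default frame when id is unique
      ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    )
  , 'unknown'
  ) AS gnme
FROM tbl
ORDER BY id;
```

Writing the frame explicitly is optional here; it mainly documents that the window runs from the highest id down to the current row.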