How to filter out in SQL the records in a partitioned, ordered table where the first records of a group are null?

Data

    ROW  YEAR  PROD   KEY   DATE
    1    2011  APPLE  TIME  2011-11-18 00:00:00.000
    2    2011  APPLE  TIME  2011-11-19 00:00:00.000
    3    2013  APPLE  NULL  2011-11-18 00:00:00.000
    4    2013  APPLE  NULL  2011-11-19 00:00:00.000
    5    2013  APPLE  TIME  2014-04-08 00:00:00.000
    6    2013  APPLE  DIM   2014-04-09 00:00:00.000
    7    2013  APPLE  TIME  2014-11-10 10:50:14.113
    8    2013  APPLE  TIME  2014-11-12 10:46:04.947
    9    2013  MELON  JAK   2011-10-17 11:01:19.657
    10   2013  MELON  TIME  2014-11-18 11:19:35.547
    11   2013  MELON  NULL  2014-11-19 11:19:35.547
    12   2013  MELON  TIME  2014-11-21 10:32:36.017
    13   2014  APPLE  JAK   2003-04-10 00:00:00.000
    14   2014  APPLE  DIM   2003-04-11 00:00:00.000
    15   2015  APPLE  TIME  2002-09-27 00:00:00.000
    16   2015  APPLE  NULL  2004-09-28 00:00:00.000

ROW is not a column in the table. It is only there to show which records I want.

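Problem

The data above is partitioned by (YEAR, PROD) and ordered by DATE.

I need to keep all rows except rows 3 and 4, according to the following logic:

Each group must start with a record whose KEY is not null

==> otherwise discard

In other words, I can have: not null, null, not null, null

but I cannot have: null, not null, null, not null

Expected result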

    ROW  YEAR  PROD   KEY   DATE
    1    2011  APPLE  TIME  2011-11-18 00:00:00.000
    2    2011  APPLE  TIME  2011-11-19 00:00:00.000

    5    2013  APPLE  TIME  2014-04-08 00:00:00.000
    6    2013  APPLE  DIM   2014-04-09 00:00:00.000
    7    2013  APPLE  TIME  2014-11-10 10:50:14.113
    8    2013  APPLE  TIME  2014-11-12 10:46:04.947
    9    2013  MELON  JAK   2011-10-17 11:01:19.657
    10   2013  MELON  TIME  2014-11-18 11:19:35.547
    11   2013  MELON  TIME  2014-11-19 11:19:35.547
    12   2013  MELON  TIME  2014-11-21 10:32:36.017
    13   2014  APPLE  JAK   2003-04-10 00:00:00.000
    14   2014  APPLE  DIM   2003-04-11 00:00:00.000
    15   2015  APPLE  TIME  2002-09-27 00:00:00.000
    16   2015  APPLE  TIME  2004-09-28 00:00:00.000

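I want to do this so that every group always starts with a non-null key. That way I can later use the previous row to fill in subsequent records that have a null key (rows 11 and 16 in this example).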
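To make the rule and the later fill step concrete, here is a minimal sketch in plain Python rather than SQL. It assumes the sample rows above are held as tuples, with `None` standing for NULL; the function name `drop_leading_nulls_and_fill` is made up for illustration.

```python
from itertools import dropwhile, groupby

# Sample rows from the question: (row, year, prod, key, date); None stands for NULL.
ROWS = [
    (1, 2011, "APPLE", "TIME", "2011-11-18 00:00:00.000"),
    (2, 2011, "APPLE", "TIME", "2011-11-19 00:00:00.000"),
    (3, 2013, "APPLE", None, "2011-11-18 00:00:00.000"),
    (4, 2013, "APPLE", None, "2011-11-19 00:00:00.000"),
    (5, 2013, "APPLE", "TIME", "2014-04-08 00:00:00.000"),
    (6, 2013, "APPLE", "DIM", "2014-04-09 00:00:00.000"),
    (7, 2013, "APPLE", "TIME", "2014-11-10 10:50:14.113"),
    (8, 2013, "APPLE", "TIME", "2014-11-12 10:46:04.947"),
    (9, 2013, "MELON", "JAK", "2011-10-17 11:01:19.657"),
    (10, 2013, "MELON", "TIME", "2014-11-18 11:19:35.547"),
    (11, 2013, "MELON", None, "2014-11-19 11:19:35.547"),
    (12, 2013, "MELON", "TIME", "2014-11-21 10:32:36.017"),
    (13, 2014, "APPLE", "JAK", "2003-04-10 00:00:00.000"),
    (14, 2014, "APPLE", "DIM", "2003-04-11 00:00:00.000"),
    (15, 2015, "APPLE", "TIME", "2002-09-27 00:00:00.000"),
    (16, 2015, "APPLE", None, "2004-09-28 00:00:00.000"),
]


def drop_leading_nulls_and_fill(rows):
    """Drop leading NULL-key rows per (year, prod) group, then forward-fill the key."""
    # ISO-style date strings sort chronologically, so a plain sort is enough here.
    ordered = sorted(rows, key=lambda r: (r[1], r[2], r[4]))
    result = []
    for _, group in groupby(ordered, key=lambda r: (r[1], r[2])):
        last_key = None
        # dropwhile skips rows until the first non-NULL key in the group.
        for row, year, prod, key, date in dropwhile(lambda r: r[3] is None, group):
            last_key = key if key is not None else last_key
            result.append((row, year, prod, last_key, date))
    return result


filled = drop_leading_nulls_and_fill(ROWS)
print([r[0] for r in filled])  # rows 3 and 4 are gone
```

With the sample data this keeps every row except 3 and 4, and rows 11 and 16 pick up `TIME` from the row before them, as in the expected result.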

Any comments or suggestions would be appreciated!

There may be a better solution, but essentially (you can drop the square brackets if KEY, DATE, etc. are not reserved words in your product; I am using T-SQL):

select * 
from Tbl T1
where 
  /* Do not include if... */
  NOT (
       T1.[KEY] is null
       /* This row is part of the leading run of NULL keys in its group
          (no preceding record with a non-NULL KEY) */
        and not exists
           (select 1
            from Tbl T3
            where T3.[YEAR]=T1.[YEAR]
            and T3.PROD=T1.PROD
            and T3.[DATE] < T1.[DATE]
            and T3.[KEY] is not null
           )
       /* There are non-NULL KEY values further down */
       and exists 
           (select 1
            from Tbl T2
            where T2.[YEAR]=T1.[YEAR]
            and T2.PROD=T1.PROD
            and T2.[DATE] > T1.[DATE]
            and T2.[KEY] is not null
            )
      )

A query like this may help:

select YEAR, PROD, KEY, DATE
  from (
        select YEAR, PROD, KEY, DATE, 
               MIN(CASE WHEN KEY IS NULL THEN DATE ELSE NULL END)
               OVER(PARTITION BY YEAR, PROD) AS MIN_NULL_KEY_DATE,
               ROW_NUMBER() OVER(PARTITION BY YEAR, PROD ORDER BY DATE ASC) RN
          from your_table yt
       )rpr
 where 1 = 1
   and CASE WHEN RN = 1 AND DATE = MIN_NULL_KEY_DATE THEN 0 ELSE 1 END = 1

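What this does: MIN_NULL_KEY_DATE is the earliest date among the rows with a null key, computed per (YEAR, PROD) group, and RN numbers the rows of each group by DATE. A row is dropped when RN = 1 and its DATE equals MIN_NULL_KEY_DATE, that is, when the first row of the group has a null key. Note that only that first row is dropped: if a group starts with several null-key rows (rows 3 and 4 in the sample), the later ones have RN > 1 and are kept.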
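The query above only ever removes the row with RN = 1. A possible variant that removes the whole leading run of null keys compares each row's date with the date of the first non-null key in its group. The sketch below is not the answerer's query. It runs on SQLite through Python's `sqlite3` (window functions need SQLite 3.25 or later) instead of SQL Server, and the table name `t` and the columns `id`, `key1`, `date1` are assumptions chosen to avoid reserved words.

```python
import sqlite3

# Sample rows from the question: (id, year, prod, key1, date1); None becomes NULL.
ROWS = [
    (1, 2011, "APPLE", "TIME", "2011-11-18 00:00:00.000"),
    (2, 2011, "APPLE", "TIME", "2011-11-19 00:00:00.000"),
    (3, 2013, "APPLE", None, "2011-11-18 00:00:00.000"),
    (4, 2013, "APPLE", None, "2011-11-19 00:00:00.000"),
    (5, 2013, "APPLE", "TIME", "2014-04-08 00:00:00.000"),
    (6, 2013, "APPLE", "DIM", "2014-04-09 00:00:00.000"),
    (7, 2013, "APPLE", "TIME", "2014-11-10 10:50:14.113"),
    (8, 2013, "APPLE", "TIME", "2014-11-12 10:46:04.947"),
    (9, 2013, "MELON", "JAK", "2011-10-17 11:01:19.657"),
    (10, 2013, "MELON", "TIME", "2014-11-18 11:19:35.547"),
    (11, 2013, "MELON", None, "2014-11-19 11:19:35.547"),
    (12, 2013, "MELON", "TIME", "2014-11-21 10:32:36.017"),
    (13, 2014, "APPLE", "JAK", "2003-04-10 00:00:00.000"),
    (14, 2014, "APPLE", "DIM", "2003-04-11 00:00:00.000"),
    (15, 2015, "APPLE", "TIME", "2002-09-27 00:00:00.000"),
    (16, 2015, "APPLE", None, "2004-09-28 00:00:00.000"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER, year INTEGER, prod TEXT, key1 TEXT, date1 TEXT)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?)", ROWS)

# first_key_date: earliest date in the group that carries a non-NULL key.
# Rows dated before it form the leading NULL run and are filtered out.
# A group with no non-NULL key at all gets first_key_date = NULL and is
# dropped entirely, because the comparison then evaluates to NULL.
QUERY = """
SELECT id, year, prod, key1, date1
FROM (
    SELECT t.*,
           MIN(CASE WHEN key1 IS NOT NULL THEN date1 END)
               OVER (PARTITION BY year, prod) AS first_key_date
    FROM t
) AS x
WHERE date1 >= first_key_date
ORDER BY year, prod, date1
"""

kept_ids = [row[0] for row in conn.execute(QUERY)]
print(kept_ids)  # rows 3 and 4 are filtered out
```

One caveat of this variant: a null-key row that shares its exact date with the first non-null row of the group would be kept, so it relies on dates being distinct within a group.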

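The query below produces the output you want. It computes MIN(key1) over the window from the first row of the group (unbounded preceding) to the current row. Because MIN ignores NULLs, min_val is NULL only while every key seen so far in the group is NULL, so filtering on min_val IS NOT NULL drops exactly the leading null rows.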

select * from (
select year,prod,key1,date1
       ,min(key1) over(partition by year,prod order by date1 asc) as min_val
  from t
   )x
where x.min_val is not null   


+------+-------+------+-------------------------+---------+
| year | prod  | key1 |          date1          | min_val |
+------+-------+------+-------------------------+---------+
| 2011 | APPLE | TIME | 2011-11-18 00:00:00.000 | TIME    |
| 2011 | APPLE | TIME | 2011-11-19 00:00:00.000 | TIME    |
| 2013 | APPLE | TIME | 2014-04-08 00:00:00.000 | TIME    |
| 2013 | APPLE | DIM  | 2014-04-09 00:00:00.000 | DIM     |
| 2013 | APPLE | TIME | 2014-11-10 10:50:14.113 | DIM     |
| 2013 | APPLE | TIME | 2014-11-12 10:46:04.947 | DIM     |
| 2013 | MELON | JAK  | 2011-10-17 11:01:19.657 | JAK     |
| 2013 | MELON | TIME | 2014-11-18 11:19:35.547 | JAK     |
| 2013 | MELON |      | 2014-11-19 11:19:35.547 | JAK     |
| 2013 | MELON | TIME | 2014-11-21 10:32:36.017 | JAK     |
| 2014 | APPLE | JAK  | 2003-04-10 00:00:00.000 | JAK     |
| 2014 | APPLE | DIM  | 2003-04-11 00:00:00.000 | DIM     |
| 2015 | APPLE | TIME | 2002-09-27 00:00:00.000 | TIME    |
| 2015 | APPLE |      | 2004-09-28 00:00:00.000 | TIME    |
+------+-------+------+-------------------------+---------+

link https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=ae82f64802674aa60005b8e9f534a150
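
The same running-window idea can also be written with COUNT, which may state the intent more directly: COUNT(key1) counts only non-null keys, so a running count of 0 means no key has been seen yet in the group. This is a sketch under assumptions, not the query from the fiddle. It runs on SQLite (3.25 or later) through Python's `sqlite3` rather than on SQL Server, and the `id` column is added only so the kept rows can be checked.

```python
import sqlite3

# Sample rows from the question: (id, year, prod, key1, date1); None becomes NULL.
ROWS = [
    (1, 2011, "APPLE", "TIME", "2011-11-18 00:00:00.000"),
    (2, 2011, "APPLE", "TIME", "2011-11-19 00:00:00.000"),
    (3, 2013, "APPLE", None, "2011-11-18 00:00:00.000"),
    (4, 2013, "APPLE", None, "2011-11-19 00:00:00.000"),
    (5, 2013, "APPLE", "TIME", "2014-04-08 00:00:00.000"),
    (6, 2013, "APPLE", "DIM", "2014-04-09 00:00:00.000"),
    (7, 2013, "APPLE", "TIME", "2014-11-10 10:50:14.113"),
    (8, 2013, "APPLE", "TIME", "2014-11-12 10:46:04.947"),
    (9, 2013, "MELON", "JAK", "2011-10-17 11:01:19.657"),
    (10, 2013, "MELON", "TIME", "2014-11-18 11:19:35.547"),
    (11, 2013, "MELON", None, "2014-11-19 11:19:35.547"),
    (12, 2013, "MELON", "TIME", "2014-11-21 10:32:36.017"),
    (13, 2014, "APPLE", "JAK", "2003-04-10 00:00:00.000"),
    (14, 2014, "APPLE", "DIM", "2003-04-11 00:00:00.000"),
    (15, 2015, "APPLE", "TIME", "2002-09-27 00:00:00.000"),
    (16, 2015, "APPLE", None, "2004-09-28 00:00:00.000"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER, year INTEGER, prod TEXT, key1 TEXT, date1 TEXT)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?)", ROWS)

# With ORDER BY and no explicit frame, the window runs from the start of the
# partition to the current row, so keys_so_far is a running count of
# non-NULL keys within each (year, prod) group.
QUERY = """
SELECT id, year, prod, key1, date1
FROM (
    SELECT id, year, prod, key1, date1,
           COUNT(key1) OVER (PARTITION BY year, prod ORDER BY date1) AS keys_so_far
    FROM t
) AS x
WHERE keys_so_far > 0
ORDER BY year, prod, date1
"""

kept_ids = [row[0] for row in conn.execute(QUERY)]
print(kept_ids)  # rows 3 and 4 are filtered out
```

Rows 11 and 16 stay in the result with their null keys, since an earlier row in their group already has a key.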