Exclude group of records if number ever goes up

I have a road inspection table:

INSPECTION_ID         ROAD_ID       INSP_DATE CONDITION_RATING
--------------------- ------------- --------- ----------------
506411                3040          01-JAN-81               15
508738                3040          14-APR-85               15
512461                3040          22-MAY-88               14
515077                3040          17-MAY-91               14 -- all ok

505967                3180          01-MAY-81               11
507655                3180          13-APR-85                9
512374                3180          11-MAY-88               17 <-- goes up; NOT ok
515626                3180          25-APR-91             16.5

502798                3260          01-MAY-83               14
508747                3260          13-APR-85               13
511373                3260          11-MAY-88               12
514734                3260          25-APR-91               12  -- all ok

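I want to write a query that excludes an entire road if its condition rating ever goes up over time. For example, exclude road 3180, because its rating goes from 9 to 17 (an anomaly).

Question:

How can I do this in Oracle SQL?


Sample data: db<>fiddle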
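
In case the fiddle is not at hand, here is a minimal setup sketch built from the rows listed above. The table name `test_data` and the column types are assumptions (the answers below use `test`, `test_data` and `t`); the dates assume the two-digit years are in the 1900s.

```sql
-- Hypothetical setup: table name and column types are assumptions.
create table test_data (
    inspection_id    number,
    road_id          number,
    insp_date        date,
    condition_rating number
);

-- Road 3040: rating never goes up
insert into test_data values (506411, 3040, date '1981-01-01', 15);
insert into test_data values (508738, 3040, date '1985-04-14', 15);
insert into test_data values (512461, 3040, date '1988-05-22', 14);
insert into test_data values (515077, 3040, date '1991-05-17', 14);

-- Road 3180: rating goes from 9 to 17, so the whole road should be excluded
insert into test_data values (505967, 3180, date '1981-05-01', 11);
insert into test_data values (507655, 3180, date '1985-04-13', 9);
insert into test_data values (512374, 3180, date '1988-05-11', 17);
insert into test_data values (515626, 3180, date '1991-04-25', 16.5);

-- Road 3260: rating never goes up
insert into test_data values (502798, 3260, date '1983-05-01', 14);
insert into test_data values (508747, 3260, date '1985-04-13', 13);
insert into test_data values (511373, 3260, date '1988-05-11', 12);
insert into test_data values (514734, 3260, date '1991-04-25', 12);
```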

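Here's one option:

  • find the "next" condition_rating value (within the same road_id, which is the partition by clause, ordered by insp_date)
  • return each road_id for which the difference between the "current" and the "next" condition_rating is less than zero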

SQL> with temp as
  2    (select road_id,
  3            condition_rating,
  4            nvl(lead(condition_rating) over (partition by road_id order by insp_date),
  5                condition_rating) next_cr
  6     from test
  7    )
  8  select distinct road_id
  9  from temp
 10  where condition_rating - next_cr < 0;

   ROAD_ID
----------
      3180

SQL>
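
The query above lists the roads whose rating goes up. To actually exclude them, one possibility is to wrap the same idea in a NOT EXISTS. This is only a sketch, assuming the same `test` table; the `flagged` CTE and its `goes_up` column are names made up for illustration.

```sql
-- Sketch: keep only inspections of roads whose rating never goes up.
with flagged as (
    select road_id,
           -- 1 when the next inspection has a higher rating; null otherwise
           -- (including the last row per road, where lead() returns null)
           case
               when condition_rating <
                    lead(condition_rating) over (partition by road_id order by insp_date)
               then 1
           end as goes_up
    from test
)
select t.*
from test t
where not exists (select 1
                  from flagged f
                  where f.road_id = t.road_id
                    and f.goes_up = 1)
order by t.road_id, t.insp_date;
```

With the sample data from the question, this keeps the rows for roads 3040 and 3260 and drops all four rows for road 3180.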

Here's an answer that is similar to @Littlefoot's answer:

with insp as (
select 
    road_id,
    condition_rating,
    insp_date,
    case when   condition_rating   >   lag(condition_rating,1) over(partition by road_id order by insp_date)   then 'Y'   end as condition_goes_up
from 
    test_data
)

select
    insp.*
from
    insp
left join
    (
    select distinct
        road_id,
        condition_goes_up
    from
        insp
    where
        condition_goes_up = 'Y'
    ) insp_flag
    on insp.road_id = insp_flag.road_id
where
    insp_flag.condition_goes_up is null

--Note: I removed the ORDER BY, because I think the window function already orders the rows the way I want.
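
One caveat on that note: the ORDER BY inside an OVER clause only orders rows for the window calculation, and without an ORDER BY on the outer query the order of the result is not guaranteed. If the output order matters, a variant along these lines might be safer. It is a sketch that swaps the left join for NOT IN and assumes road_id is never null.

```sql
with insp as (
    select road_id,
           condition_rating,
           insp_date,
           case
               when condition_rating >
                    lag(condition_rating) over (partition by road_id order by insp_date)
               then 'Y'
           end as condition_goes_up
    from test_data
)
select insp.*
from insp
-- assumes road_id is never null; NOT IN returns no rows if the list contains a null
where road_id not in (select road_id
                      from insp
                      where condition_goes_up = 'Y')
order by road_id, insp_date;   -- explicit ordering of the final result
```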

db<>fiddle


Edit:

Here's a version that is similar to what @Markus Winand did:

with insp as (
select 
    road_id,
    condition_rating,
    insp_date,
    case when   condition_rating   >   lag(condition_rating,1) over(partition by road_id order by insp_date)   then 'Y'   end as condition_goes_up
from 
    test_data
)

select
    insp_tagged.*
from
    (
    select
        insp.*,
        count(condition_goes_up) over(partition by road_id) as condition_goes_up_count
    from
        insp
    ) insp_tagged
where
    condition_goes_up_count = 0 

I ended up going with that option.

db<>fiddle

This is based on the OP's own answer, which made the expected result clearer.

In my perpetual urge to avoid self-joins, I'd go for nested window functions:

SELECT road_id, condition_rating, insp_date
  FROM ( SELECT prev.*
              , COUNT(CASE WHEN condition_rating < next_cr THEN 1 END) OVER(PARTITION BY road_id) bad
           FROM (select t.*
                      , lead(condition_rating) over (partition by road_id order by insp_date) next_cr
                  from t
                ) prev
       ) tagged
WHERE bad = 0
ORDER BY road_id, insp_date

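Notes

  • lead() returns null for the last row of each road. The case expression that marks bad rows handles this: it tests condition_rating < next_cr, and if next_cr is null the condition is not true, so the case maps that row as "not bad".
  • The case is only there to mimic the filter clause: https://modern-sql.com/feature/filter
  • MATCH_RECOGNIZE might be another option for this problem, but due to the lack of '^' and '$' I'm worried that backtracking might cause more problems than it is worth.
  • Nested window functions typically don't take a big performance hit if they use compatible OVER clauses, as in this query.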
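
To illustrate the note about the filter clause: in a database that supports the standard FILTER clause on window aggregates (PostgreSQL, for example; as far as I know Oracle did not at the time of writing), the tagging step could be written without the case. A sketch against the same table `t`:

```sql
-- Not Oracle syntax: relies on the standard FILTER clause (e.g. PostgreSQL).
SELECT road_id, condition_rating, insp_date
  FROM ( SELECT prev.*
              , COUNT(*) FILTER (WHERE condition_rating < next_cr)
                         OVER (PARTITION BY road_id) AS bad
           FROM ( SELECT t.*
                       , LEAD(condition_rating)
                           OVER (PARTITION BY road_id ORDER BY insp_date) AS next_cr
                    FROM t
                ) prev
       ) tagged
 WHERE bad = 0          -- roads with no rating increase at all
 ORDER BY road_id, insp_date;
```

Rows where next_cr is null are simply not counted, the same way the case version yields null for them.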