Exclude group of records if number ever goes up
I have a road inspection table:
INSPECTION_ID    ROAD_ID INSP_DATE CONDITION_RATING
------------- ---------- --------- ----------------
       506411       3040 01-JAN-81               15
       508738       3040 14-APR-85               15
       512461       3040 22-MAY-88               14
       515077       3040 17-MAY-91               14  -- all ok
       505967       3180 01-MAY-81               11
       507655       3180 13-APR-85                9
       512374       3180 11-MAY-88               17  <-- goes up; NOT ok
       515626       3180 25-APR-91             16.5
       502798       3260 01-MAY-83               14
       508747       3260 13-APR-85               13
       511373       3260 11-MAY-88               12
       514734       3260 25-APR-91               12  -- all ok
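I want to write a query that excludes an entire road if that road's condition rating ever goes up over time. For example, exclude road 3180, because its rating goes from 9 to 17 (an anomaly).
Question:
How can I do this using Oracle SQL?
Sample data: db<>fiddle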
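For readers without access to the fiddle, the sample rows above could be recreated roughly as follows. This is only a sketch: the table name test_data is borrowed from one of the answers below (other answers call the table test or t), the column types are assumptions, and the two-digit years are assumed to mean 19xx.

```sql
-- Hypothetical setup script; column types and the table name are assumptions.
create table test_data (
  inspection_id    number primary key,
  road_id          number not null,
  insp_date        date   not null,
  condition_rating number
);

-- Road 3040: rating never goes up
insert into test_data values (506411, 3040, date '1981-01-01', 15);
insert into test_data values (508738, 3040, date '1985-04-14', 15);
insert into test_data values (512461, 3040, date '1988-05-22', 14);
insert into test_data values (515077, 3040, date '1991-05-17', 14);

-- Road 3180: rating goes up from 9 to 17
insert into test_data values (505967, 3180, date '1981-05-01', 11);
insert into test_data values (507655, 3180, date '1985-04-13', 9);
insert into test_data values (512374, 3180, date '1988-05-11', 17);
insert into test_data values (515626, 3180, date '1991-04-25', 16.5);

-- Road 3260: rating never goes up
insert into test_data values (502798, 3260, date '1983-05-01', 14);
insert into test_data values (508747, 3260, date '1985-04-13', 13);
insert into test_data values (511373, 3260, date '1988-05-11', 12);
insert into test_data values (514734, 3260, date '1991-04-25', 12);

commit;
```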
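Here's one option:
- find the "next" condition_rating value (within the same road_id, which is the partition by clause, ordered by insp_date)
- return each road_id where the current condition_rating minus the "next" one is less than zero, i.e. the rating goes up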
SQL> with temp as
       (select road_id,
               condition_rating,
               nvl(lead(condition_rating) over (partition by road_id order by insp_date),
                   condition_rating) next_cr
        from test
       )
     select distinct road_id
     from temp
     where condition_rating - next_cr < 0;

   ROAD_ID
----------
      3180

SQL>
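The query above lists the roads to exclude rather than the rows to keep. One possible way to turn it into the exclusion the question asks for is an anti-join with NOT EXISTS. This is a sketch, assuming the same table name test and the columns shown in the question; the nvl() is dropped because a null next_cr never satisfies the comparison anyway.

```sql
-- Sketch: keep only inspections of roads whose rating never goes up.
with temp as (
  select road_id,
         condition_rating,
         lead(condition_rating) over (partition by road_id order by insp_date) next_cr
  from test
)
select t.*
from test t
where not exists (
        select 1
        from temp x
        where x.road_id = t.road_id
          and x.condition_rating < x.next_cr  -- last row per road has a null next_cr, so it never matches
      )
order by t.road_id, t.insp_date;
```

With the sample data, this should leave the rows for roads 3040 and 3260.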
Here's an answer that is similar to @Littlefoot's:
with insp as (
    select
        road_id,
        condition_rating,
        insp_date,
        case when condition_rating > lag(condition_rating,1) over(partition by road_id order by insp_date) then 'Y' end as condition_goes_up
    from
        test_data
)
select
    insp.*
from
    insp
left join
    (
    select distinct
        road_id,
        condition_goes_up
    from
        insp
    where
        condition_goes_up = 'Y'
    ) insp_flag
    on insp.road_id = insp_flag.road_id
where
    insp_flag.condition_goes_up is null
--Note: there is no ORDER BY here. The ORDER BY inside the window function does not guarantee the order of the result rows, so add an outer ORDER BY if the order matters.
Edit:
Here's a version that is similar to what @Markus Winand did:
with insp as (
    select
        road_id,
        condition_rating,
        insp_date,
        case when condition_rating > lag(condition_rating,1) over(partition by road_id order by insp_date) then 'Y' end as condition_goes_up
    from
        test_data
)
select
    insp_tagged.*
from
    (
    select
        insp.*,
        -- count() ignores nulls, so this counts only the rows flagged 'Y'
        count(condition_goes_up) over(partition by road_id) as condition_goes_up_count
    from
        insp
    ) insp_tagged
where
    condition_goes_up_count = 0
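I ended up going with that option.
Based on the OP's own answer, which made the expected result clearer.
In my perpetual urge to avoid self-joins, I'd go for nested window functions: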
SELECT road_id, condition_rating, insp_date
  FROM ( SELECT prev.*
              , COUNT(CASE WHEN condition_rating < next_cr THEN 1 END) OVER(PARTITION BY road_id) bad
           FROM (select t.*
                      , lead(condition_rating) over (partition by road_id order by insp_date) next_cr
                   from t
                ) prev
       ) tagged
 WHERE bad = 0
 ORDER BY road_id, insp_date
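Notes:
- lead() returns null for the last row of each road. The case expression that flags bad rows takes this into account: in condition_rating < next_cr, a null next_cr never makes the condition true, so the case maps that row to "not bad".
- The case is just there to mimic the filter clause: https://modern-sql.com/feature/filter
- MATCH_RECOGNIZE might be another option for this problem, but due to the lack of '^' and '$' I fear that backtracking might cause more trouble than it's worth.
- Nested window functions typically don't cost much in performance if they use compatible OVER clauses, as in this query.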
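To illustrate the note about the filter clause: in a database that supports FILTER on window aggregates (PostgreSQL is used here as an assumption; this is not the Oracle syntax the question asks about), the same query might be written without the case expression. The table name t is taken from the answer above.

```sql
-- Sketch for a database with FILTER support (e.g. PostgreSQL); not Oracle syntax.
select road_id, condition_rating, insp_date
from (select prev.*,
             -- counts only the rows where the rating rises to the next inspection
             count(*) filter (where condition_rating < next_cr)
               over (partition by road_id) as bad
      from (select t.*,
                   lead(condition_rating) over (partition by road_id order by insp_date) as next_cr
            from t
           ) prev
     ) tagged
where bad = 0
order by road_id, insp_date;
```

The logic is unchanged: a null next_cr on the last row of each road fails the filter condition, so that row is never counted as bad.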