Optimize self-join Oracle SQL query with LAG/LEAD analytic functions?
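We have an Oracle SQL query that identifies records where the value of a table column has changed from one record to the next. The relevant columns are (ID, SOME_COLUMN, FROM_DATE, TO_DATE), where ID is not unique and FROM_DATE and TO_DATE define the interval during which a given row for that ID is valid, e.g.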
(ID1, VAL1, 01/01/2016, 03/01/2016)
(ID1, VAL2, 04/01/2016, 09/01/2016)
(ID1, VAL3, 10/01/2016, 19/01/2016)
and so on.
We can achieve this with the following self-join:
SELECT N.ID,
O.SOME_COLUMN OLD_VALUE,
N.SOME_COLUMN NEW_VALUE
FROM OUR_TABLE N, OUR_TABLE O
WHERE N.ID = O.ID
AND N.FROM_DATE - 1 = O.TO_DATE
AND N.SOME_COLUMN <> O.SOME_COLUMN
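However, since the table contains 100 million records, this query performs very poorly. Is there a more efficient way to do this? Someone hinted at analytic functions (e.g. LAG), but so far we have not found a working solution. Any ideas would be appreciated.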
Yes, you can use LEAD() to get the next value for each row:
SELECT t.id,
t.some_column as OLD_VALUE,
LEAD(t.some_column) OVER(PARTITION BY t.id ORDER BY t.from_date) as NEW_VALUE
FROM YourTable t
If you only want the changes, wrap it in another SELECT and filter on OLD_VALUE <> NEW_VALUE.
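The wrapping step could look roughly like the sketch below. It is not Oracle code: it uses Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in so the query can be run anywhere, and the table name our_table, the ISO date strings, and the third row with an unchanged value are assumptions made for illustration. The LEAD() call itself is written the same way as in the Oracle query above.

```python
import sqlite3

# SQLite (3.25+) is only a stand-in for Oracle here; table and data are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE our_table (id TEXT, some_column TEXT, from_date TEXT, to_date TEXT)"
)
conn.executemany(
    "INSERT INTO our_table VALUES (?, ?, ?, ?)",
    [
        ("ID1", "VAL1", "2016-01-01", "2016-01-03"),
        ("ID1", "VAL2", "2016-01-04", "2016-01-09"),
        ("ID1", "VAL2", "2016-01-10", "2016-01-19"),  # hypothetical row: value unchanged
    ],
)

# Inner query pairs each row with the next row's value; outer query keeps only changes.
changes = conn.execute(
    """
    SELECT id, old_value, new_value
    FROM (
        SELECT t.id,
               t.some_column AS old_value,
               LEAD(t.some_column) OVER (PARTITION BY t.id ORDER BY t.from_date) AS new_value
        FROM our_table t
    )
    WHERE old_value <> new_value
    ORDER BY id, old_value
    """
).fetchall()

print(changes)
```

Two things to keep in mind: the last row of each ID has a NULL NEW_VALUE, so the <> comparison drops it, and this version does not check that the two rows are adjacent in time (FROM_DATE - 1 = TO_DATE) the way the self-join does.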
If you want the old and new values in one row, use lag():
select t.*,
lag(some_column) over (partition by id order by from_date) as prev_val
from t;
If the value might not change from one row to the next (as your example query suggests):
select t.*
from (select t.*,
lag(some_column) over (partition by id order by from_date) as prev_val
from t
) t
where prev_val <> some_column;
I think this is the LAG() approach you were referring to.
SELECT *
FROM (
SELECT N.ID,
N.SOME_COLUMN NEW_VALUE,
N.FROM_DATE,
lag(N.SOME_COLUMN) over (partition by N.ID order by FROM_DATE) OLD_VALUE,
lag(N.TO_DATE) over (partition by N.ID order by FROM_DATE) OLD_TO_DATE
FROM OUR_TABLE N
) T
WHERE FROM_DATE - 1 = OLD_TO_DATE
AND NEW_VALUE <> OLD_VALUE;
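To check that this returns the same rows as the original self-join, here is a small sketch that runs both on sample data. It uses Python's built-in sqlite3 module (SQLite 3.25 or later) rather than Oracle, so FROM_DATE - 1 is written as date(from_date, '-1 day') on ISO date strings; the ID2 rows, which leave a gap between intervals, are made up for the test. It also assumes the intervals for one ID never overlap, so the previous row by FROM_DATE is the only possible adjacent one.

```python
import sqlite3

# SQLite (3.25+) is only a stand-in for Oracle here; the ID2 rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE our_table (id TEXT, some_column TEXT, from_date TEXT, to_date TEXT)"
)
conn.executemany(
    "INSERT INTO our_table VALUES (?, ?, ?, ?)",
    [
        ("ID1", "VAL1", "2016-01-01", "2016-01-03"),
        ("ID1", "VAL2", "2016-01-04", "2016-01-09"),
        ("ID1", "VAL3", "2016-01-10", "2016-01-19"),
        ("ID2", "A", "2016-01-01", "2016-01-03"),
        ("ID2", "B", "2016-01-10", "2016-01-19"),  # gap after the previous interval
    ],
)

# Original approach: join the table to itself on adjacent intervals.
join_rows = conn.execute(
    """
    SELECT n.id, o.some_column AS old_value, n.some_column AS new_value
    FROM our_table n
    JOIN our_table o ON n.id = o.id
    WHERE date(n.from_date, '-1 day') = o.to_date
      AND n.some_column <> o.some_column
    ORDER BY n.id, n.from_date
    """
).fetchall()

# LAG approach: fetch the previous row's value and end date in one pass over the table.
lag_rows = conn.execute(
    """
    SELECT id, old_value, new_value
    FROM (
        SELECT n.id,
               n.some_column AS new_value,
               n.from_date,
               LAG(n.some_column) OVER (PARTITION BY n.id ORDER BY n.from_date) AS old_value,
               LAG(n.to_date) OVER (PARTITION BY n.id ORDER BY n.from_date) AS old_to_date
        FROM our_table n
    ) t
    WHERE date(from_date, '-1 day') = old_to_date
      AND new_value <> old_value
    ORDER BY id, from_date
    """
).fetchall()

print(join_rows)
print(lag_rows)
```

Both queries report the two changes for ID1 and skip ID2, whose intervals are not adjacent. If intervals for the same ID can overlap, the two forms may differ, because LAG only ever looks at the immediately preceding row.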