Limit data within an OVER clause in Oracle
I want to aggregate columns based on a timestamp.
For example:
The table contains the columns col1, col2, ..., and col_ts (a timestamp column).
SELECT
SUM(col1) OVER (ORDER BY col_ts ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING) SUM1,
SUM(col2) OVER (ORDER BY col_ts ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING) SUM2
FROM ...
Now I want the 2 preceding and 2 following rows to be summed only when the difference between the timestamps is <= 5 minutes.
For example, take these timestamp values:
14.09.15 15:44:00
14.09.15 15:50:00
14.09.15 15:51:00
14.09.15 15:52:00
14.09.15 15:53:00
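For the row with timestamp "14.09.15 15:51:00", I want the SUM OVER to cover only the values from 15:50 to 15:53, because the difference between 15:50 and 15:44 is more than 5 minutes.
Is there a way to write such a condition in the OVER clause?
Or does anyone have a good, efficient solution for this?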
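I think this is too much for plain SQL. You can limit the number of elements in the window (ROWS), or you can limit the window by value in some way (RANGE, see below), but not both at the same time.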
drop table fg_test;
create table fg_test(col_ts timestamp, n number);
-- k/1440/60 is k seconds expressed in days. Adding a number to a timestamp
-- yields a DATE, so every row except the first loses its fractional seconds.
insert into fg_test values (systimestamp, 1);
insert into fg_test values (systimestamp+4/1440/60, 2);
insert into fg_test values (systimestamp+5/1440/60, 3);
insert into fg_test values (systimestamp+6/1440/60, 4);
insert into fg_test values (systimestamp+7/1440/60, 5);
insert into fg_test values (systimestamp+13/1440/60, 6);
select col_ts, n,
SUM(n) OVER (ORDER BY col_ts ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) SUM1,
SUM(n) OVER (ORDER BY col_ts RANGE BETWEEN current row AND interval '5' second FOLLOWING) SUMNEW
from fg_test;
Result:
COL_TS                                   N       SUM1     SUMNEW
------------------------------- ---------- ---------- ----------
14-SEP-15 06.16.28.825395000 PM          1          3          3
14-SEP-15 06.16.33.000000000 PM          2          6         14
14-SEP-15 06.16.34.000000000 PM          3          9         12
14-SEP-15 06.16.35.000000000 PM          4         12          9
14-SEP-15 06.16.36.000000000 PM          5         15          5
14-SEP-15 06.16.42.000000000 PM          6         11          6
(Sorry for not following the exact example from your question.)
Another option is to write some PL/SQL (open a cursor and do some processing).
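If both limits are needed at once without leaving SQL, one possible workaround is a self-join on a row number instead of a window frame. This is only a sketch under assumptions: it reuses fg_test from above, and the names numbered, rn and sum_both are made up for illustration. It also measures the 5 seconds against the current row, not between neighbouring rows, so check which of the two rules you actually need.

```sql
-- Sketch only: assumes the fg_test table created above.
with numbered as (
  select col_ts,
         n,
         row_number() over (order by col_ts) as rn  -- position in time order
  from fg_test
)
select c.col_ts,
       c.n,
       sum(nb.n) as sum_both
from numbered c
join numbered nb
  -- count limit: at most 2 rows before and 2 rows after the current row
  on nb.rn between c.rn - 2 and c.rn + 2
  -- value limit: at most 5 seconds away from the current row
 and nb.col_ts between c.col_ts - interval '5' second
                   and c.col_ts + interval '5' second
group by c.rn, c.col_ts, c.n
order by c.col_ts;
```

Every row joins at least to itself, so sum_both is never null. A self-join like this can get expensive on large tables, so it is worth comparing the execution plan against a window-function approach.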
OK, I see the problem here. Thanks, Florin. What about some preprocessing, then? I found a solution, but I am not sure whether there is a faster one:
select col_ts,
       n,
       SUM(n) OVER (ORDER BY col_ts ROWS BETWEEN LEFT_VALUE PRECEDING AND RIGHT_VALUE FOLLOWING) MY_SUM,
       SUM(n) OVER (ORDER BY col_ts RANGE BETWEEN interval '5' second PRECEDING AND interval '5' second FOLLOWING) OLD_SUM
from (
  select col_ts,
         n,
         -- RIGHT_VALUE: how many following rows (0, 1 or 2) are reachable without a gap > 5 seconds
         CASE
           WHEN (LEAD(col_ts,1) OVER (ORDER BY col_ts) - col_ts) <= INTERVAL '5' second
           THEN
             CASE
               WHEN (LEAD(col_ts,2) OVER (ORDER BY col_ts) - LEAD(col_ts,1) OVER (ORDER BY col_ts)) <= INTERVAL '5' second
               THEN 2
               ELSE 1
             END
           ELSE 0
         END AS RIGHT_VALUE,
         -- LEFT_VALUE: how many preceding rows (0, 1 or 2) are reachable without a gap > 5 seconds
         CASE
           WHEN (col_ts - LAG(col_ts,1) OVER (ORDER BY col_ts)) <= INTERVAL '5' second
           THEN
             CASE
               WHEN (LAG(col_ts,1) OVER (ORDER BY col_ts) - LAG(col_ts,2) OVER (ORDER BY col_ts)) <= INTERVAL '5' second
               THEN 2
               ELSE 1
             END
           ELSE 0
         END AS LEFT_VALUE
  from fg_test
);
Result:
COL_TS                          N  MY_SUM     OLD_SUM
--------------------------- ----- ------- -----------
15.09.15 09:34:24,069000000     1       6           6
15.09.15 09:34:28,000000000     2      10          15
15.09.15 09:34:29,000000000     3      15          15
15.09.15 09:34:30,000000000     4      14          14
15.09.15 09:34:31,000000000     5      12          14
15.09.15 09:34:37,000000000     6       6           6
What do you think?
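A possibly simpler way to get the same MY_SUM, offered as a sketch rather than a tested solution: flag every row that starts after a gap of more than 5 seconds, turn the running total of those flags into a group number, and then use a plain ROWS frame inside each group. The names new_grp and grp are hypothetical; fg_test is the table from above.

```sql
-- Sketch only: assumes the fg_test table and the 5-second gap rule used above.
select col_ts,
       n,
       sum(n) over (partition by grp
                    order by col_ts
                    rows between 2 preceding and 2 following) as my_sum
from (
  select col_ts,
         n,
         -- running total of the gap flags = number of the current group
         sum(new_grp) over (order by col_ts) as grp
  from (
    select col_ts,
           n,
           case
             when col_ts - lag(col_ts) over (order by col_ts) <= interval '5' second
             then 0
             else 1  -- first row (LAG is null) or gap larger than 5 seconds
           end as new_grp
    from fg_test
  )
)
order by col_ts;
```

Because a gap starts a new partition, the frame can never reach across it. On the six sample rows this should give the same MY_SUM values as the LEAD/LAG version (6, 10, 15, 14, 12, 6). Changing the frame size, say to 3 PRECEDING and 3 FOLLOWING, would then not need extra nested CASE branches.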