Filter rows on dates depending on different column values
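I've been trying to filter rows based on whether `task_dts` falls within a time range that depends on the value of `code`.
In essence, for each `id` I only want the rows where the `task_dts` timestamp falls between the `code_dts` of the current `code` and the `code_dts` of the following `code`.
For example, for rows where `code` equals 'z', I only want the rows where `task_dts` is within the range of `code_dts` for value 'z' and value 'y'.
For rows where `code` equals 'y', I only want the rows where `task_dts` is within the range of `code_dts` for value 'y' and value 'x', and so on.
My table looks as follows: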
rowid | id | code | code_dts | task | task_dts |
---|---|---|---|---|---|
1 | a | z | 2022-02-01 10:17:08.403000 | 1 | 2022-02-01 10:21:27.000000 |
2 | a | z | 2022-02-01 10:17:08.403000 | 2 | 2022-02-01 10:21:31.000000 |
3 | a | z | 2022-02-01 10:17:08.403000 | 3 | 2022-02-01 12:41:43.000000 |
4 | a | y | 2022-02-01 11:12:13.270000 | 1 | 2022-02-01 10:21:27.000000 |
5 | a | y | 2022-02-01 11:12:13.270000 | 3 | 2022-02-01 12:41:43.000000 |
6 | a | y | 2022-02-01 11:12:13.270000 | 8 | 2022-02-21 14:57:53.000000 |
7 | a | x | 2022-02-21 12:28:50.647000 | 6 | 2022-02-21 14:57:53.000000 |
8 | a | x | 2022-02-21 12:28:50.647000 | 7 | 2022-02-21 14:57:54.000000 |
9 | b | h | 2022-04-05 13:44:16.030000 | 1 | 2022-04-05 14:03:56.570000 |
10 | b | h | 2022-04-05 13:44:16.030000 | 2 | 2022-04-05 14:03:56.570000 |
11 | b | i | 2022-04-06 13:44:16.030000 | 1 | 2022-04-05 14:03:56.570000 |
12 | b | j | 2022-04-07 13:44:16.030000 | 3 | 2022-04-05 14:03:56.570000 |
The output would look as follows:
rowid | id | code | code_dts | task | task_dts |
---|---|---|---|---|---|
1 | a | z | 2022-02-01 10:17:08.403000 | 1 | 2022-02-01 10:21:27.000000 |
2 | a | z | 2022-02-01 10:17:08.403000 | 2 | 2022-02-01 10:21:31.000000 |
5 | a | y | 2022-02-01 11:12:13.270000 | 3 | 2022-02-01 12:41:43.000000 |
7 | a | x | 2022-02-21 12:28:50.647000 | 6 | 2022-02-21 14:57:53.000000 |
8 | a | x | 2022-02-21 12:28:50.647000 | 7 | 2022-02-21 14:57:54.000000 |
10 | b | h | 2022-04-05 13:44:16.030000 | 2 | 2022-04-05 14:03:56.570000 |
11 | b | i | 2022-04-06 13:44:16.030000 | 1 | 2022-04-05 14:03:56.570000 |
12 | b | j | 2022-04-07 13:44:16.030000 | 3 | 2022-04-05 14:03:56.570000 |
I've tried solving this with `qualify`, but without success. Any help would be appreciated.
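You can use a table expression to pre-compute the timestamp range of each code. The filtering is then simple.
For example:

    select t.*
    from t
    join (
      -- pair each code's start timestamp with the start of the next code
      select code, dt, lead(dt) over(order by dt) as next_dt
      from (select code, min(code_dts) as dt from t group by code) x
    ) y on t.code = y.code
    -- the last code has no next_dt, so all of its rows are kept
    where t.task_dts between y.dt and y.next_dt or y.next_dt is null
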
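To try the join-based approach without a database server, here is a minimal sketch that runs it in SQLite through Python's `sqlite3` module on the sample rows from the question. Several things here are assumptions, not part of the answer: the `partition by id` and `group by id, code` additions (the question asks for rows per `id`, while the query above orders codes globally), the column name `row_id` (renamed to stay clear of SQLite's built-in `rowid`), timestamps stored as fixed-width text, and an SQLite build with window-function support (3.25 or later).

```python
import sqlite3

# Sample rows from the question: (row_id, id, code, code_dts, task, task_dts).
ROWS = [
    (1, "a", "z", "2022-02-01 10:17:08.403000", 1, "2022-02-01 10:21:27.000000"),
    (2, "a", "z", "2022-02-01 10:17:08.403000", 2, "2022-02-01 10:21:31.000000"),
    (3, "a", "z", "2022-02-01 10:17:08.403000", 3, "2022-02-01 12:41:43.000000"),
    (4, "a", "y", "2022-02-01 11:12:13.270000", 1, "2022-02-01 10:21:27.000000"),
    (5, "a", "y", "2022-02-01 11:12:13.270000", 3, "2022-02-01 12:41:43.000000"),
    (6, "a", "y", "2022-02-01 11:12:13.270000", 8, "2022-02-21 14:57:53.000000"),
    (7, "a", "x", "2022-02-21 12:28:50.647000", 6, "2022-02-21 14:57:53.000000"),
    (8, "a", "x", "2022-02-21 12:28:50.647000", 7, "2022-02-21 14:57:54.000000"),
    (9, "b", "h", "2022-04-05 13:44:16.030000", 1, "2022-04-05 14:03:56.570000"),
    (10, "b", "h", "2022-04-05 13:44:16.030000", 2, "2022-04-05 14:03:56.570000"),
    (11, "b", "i", "2022-04-06 13:44:16.030000", 1, "2022-04-05 14:03:56.570000"),
    (12, "b", "j", "2022-04-07 13:44:16.030000", 3, "2022-04-05 14:03:56.570000"),
]

# Join-based filter with an assumed per-id partition. The last code of each
# id has no next_dt, so all of its rows are kept.
QUERY = """
select t.row_id
from t
join (
    select id, code, dt,
           lead(dt) over (partition by id order by dt) as next_dt
    from (select id, code, min(code_dts) as dt from t group by id, code) x
) y on t.id = y.id and t.code = y.code
where t.task_dts between y.dt and y.next_dt or y.next_dt is null
order by t.row_id
"""


def kept_row_ids():
    """Load the sample rows into an in-memory table and return the kept row ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table t (row_id integer, id text, code text, "
        "code_dts text, task integer, task_dts text)"
    )
    conn.executemany("insert into t values (?, ?, ?, ?, ?, ?)", ROWS)
    result = [row[0] for row in conn.execute(QUERY)]
    conn.close()
    return result


if __name__ == "__main__":
    print(kept_row_ids())
```

On the sample rows this keeps 1, 2, 5, 7, 8, 9, 10 and 12. The rows for id `a` match the expected output in the question. For id `b` the rule as stated keeps rows 9 and 10 and drops row 11, which differs from the listed output, so treat that part with care.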
After reading The Impaler's answer I finally understood what you are asking for :-)
This is the same logic based on window functions:

    with cte as
     (
       select t.*
          -- next code_dts, i.e. at least one row will return
          -- the code_dts of the following code
          ,lead(code_dts,1,task_dts) over (order by code_dts) as next_dts
       from tab as t
     )
    select *
    from cte
    qualify task_dts between code_dts
                         -- assign the next code's dts to all rows within the same code
                         and max(next_dts) over (partition by code)
    ;

It's hard to say which one performs better...
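`qualify` is not available in every engine. As a rough sketch, the same window logic can be written with a second CTE and a plain `where`, shown here in SQLite through Python's `sqlite3` module on the question's sample rows. The CTE name `ranged`, the column `range_end`, and the column name `row_id` are made up for illustration; timestamps are assumed to be stored as fixed-width text, and SQLite 3.25 or later is assumed for window functions.

```python
import sqlite3

# Sample rows from the question: (row_id, id, code, code_dts, task, task_dts).
ROWS = [
    (1, "a", "z", "2022-02-01 10:17:08.403000", 1, "2022-02-01 10:21:27.000000"),
    (2, "a", "z", "2022-02-01 10:17:08.403000", 2, "2022-02-01 10:21:31.000000"),
    (3, "a", "z", "2022-02-01 10:17:08.403000", 3, "2022-02-01 12:41:43.000000"),
    (4, "a", "y", "2022-02-01 11:12:13.270000", 1, "2022-02-01 10:21:27.000000"),
    (5, "a", "y", "2022-02-01 11:12:13.270000", 3, "2022-02-01 12:41:43.000000"),
    (6, "a", "y", "2022-02-01 11:12:13.270000", 8, "2022-02-21 14:57:53.000000"),
    (7, "a", "x", "2022-02-21 12:28:50.647000", 6, "2022-02-21 14:57:53.000000"),
    (8, "a", "x", "2022-02-21 12:28:50.647000", 7, "2022-02-21 14:57:54.000000"),
    (9, "b", "h", "2022-04-05 13:44:16.030000", 1, "2022-04-05 14:03:56.570000"),
    (10, "b", "h", "2022-04-05 13:44:16.030000", 2, "2022-04-05 14:03:56.570000"),
    (11, "b", "i", "2022-04-06 13:44:16.030000", 1, "2022-04-05 14:03:56.570000"),
    (12, "b", "j", "2022-04-07 13:44:16.030000", 3, "2022-04-05 14:03:56.570000"),
]

# Same window logic as the qualify query, with the qualify condition moved
# into a where clause over a second CTE (hypothetical names: ranged, range_end).
QUERY = """
with cte as (
    select t.*,
           lead(code_dts, 1, task_dts) over (order by code_dts) as next_dts
    from tab as t
),
ranged as (
    select cte.*,
           max(next_dts) over (partition by code) as range_end
    from cte
)
select row_id
from ranged
where task_dts between code_dts and range_end
order by row_id
"""


def qualify_free_row_ids():
    """Load the sample rows into an in-memory table and return the kept row ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table tab (row_id integer, id text, code text, "
        "code_dts text, task integer, task_dts text)"
    )
    conn.executemany("insert into tab values (?, ?, ?, ?, ?, ?)", ROWS)
    result = [row[0] for row in conn.execute(QUERY)]
    conn.close()
    return result


if __name__ == "__main__":
    print(qualify_free_row_ids())
```

On the sample rows this keeps 1, 2, 5, 7, 8, 9 and 10. Row 12 is dropped because its `task_dts` is earlier than its `code_dts`, whereas the join-based query keeps every row of the last code through `or y.next_dt is null`.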