SQL to find the sum of total days in a window for a series of changes
Here is the table:
start_date | recorded_date | id |
---|---|---|
2021-11-10 | 2021-11-01 | 1a |
2021-11-08 | 2021-11-02 | 1a |
2021-11-11 | 2021-11-03 | 1a |
2021-11-10 | 2021-11-04 | 1a |
2021-11-10 | 2021-11-05 | 1a |
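I need a query to find the total change in days for a given id. In this case, start_date goes from Nov 10 to Nov 8, so 2 days; then from Nov 8 to Nov 11, so 3 days; then from Nov 11 to Nov 10, so 1 day; and finally from Nov 10 to Nov 10, so 0 days.
For id '1a' the total change is 2+3+1+0 = 6 days.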
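The arithmetic above can be checked with a short Python sketch. The function name `total_day_change` and the in-memory `rows` list are illustrative assumptions, not part of the question; the values are the five sample rows for id '1a'.

```python
from datetime import date

# (start_date, recorded_date) pairs for id '1a', copied from the sample table.
rows = [
    (date(2021, 11, 10), date(2021, 11, 1)),
    (date(2021, 11, 8), date(2021, 11, 2)),
    (date(2021, 11, 11), date(2021, 11, 3)),
    (date(2021, 11, 10), date(2021, 11, 4)),
    (date(2021, 11, 10), date(2021, 11, 5)),
]


def total_day_change(rows):
    """Sum the absolute day differences between consecutive start_date values,
    taking the rows in ascending recorded_date order."""
    ordered = [start for start, _ in sorted(rows, key=lambda r: r[1])]
    return sum(abs((cur - prev).days) for prev, cur in zip(ordered, ordered[1:]))


print(total_day_change(rows))  # 2 + 3 + 1 + 0 = 6
```

Sorting by recorded_date first is what makes the result independent of the order in which the rows happen to be stored.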
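Basically, each change has a recorded_date, so we sort by it in ascending order and then compute the total change in days, grouped by id. The final result should look like this:
id | Agg_Change |
---|---|
1a | 6 |
Is there a way to do this in SQL? I am using a Vertica database.
Thanks.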
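You can use the window function lead to get the difference between each row and the next one, then group by id. Passing start_date as the default makes the last row of each id contribute 0: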
select id, sum(daydiff) as Agg_Change
from (
    select id,
           abs(datediff(day, start_date,
               lead(start_date, 1, start_date) over (partition by id order by recorded_date))) as daydiff
    from tablename
) t
group by id
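For anyone without a Vertica instance at hand, here is a minimal sketch that runs the same LEAD pattern against an in-memory SQLite database from Python. This is a stand-in, not the Vertica query itself: it assumes SQLite 3.25 or later (for window functions), stores the dates as ISO text, and replaces DATEDIFF with a julianday() subtraction, since SQLite has no DATEDIFF.

```python
import sqlite3

# In-memory SQLite database as a stand-in; the table name follows the answer above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (start_date TEXT, recorded_date TEXT, id TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?)",
    [
        ("2021-11-10", "2021-11-01", "1a"),
        ("2021-11-08", "2021-11-02", "1a"),
        ("2021-11-11", "2021-11-03", "1a"),
        ("2021-11-10", "2021-11-04", "1a"),
        ("2021-11-10", "2021-11-05", "1a"),
    ],
)

# Same shape as the LEAD query; julianday() subtraction replaces DATEDIFF.
lead_sql = """
SELECT id, CAST(SUM(daydiff) AS INTEGER) AS Agg_Change
FROM (
    SELECT id,
           ABS(julianday(LEAD(start_date, 1, start_date)
                   OVER (PARTITION BY id ORDER BY recorded_date))
               - julianday(start_date)) AS daydiff
    FROM tablename
) t
GROUP BY id
"""
lead_result = conn.execute(lead_sql).fetchall()
print(lead_result)
```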
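I thought the lag function would give me the answer, but it kept returning the wrong result because my logic was wrong in one place. I now have the answer I need: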
with cte as (
    select id, start_date, recorded_date,
           row_number() over (partition by id order by recorded_date asc) as idrank,
           lag(start_date, 1) over (partition by id order by recorded_date asc) as prev
    from table_temp
)
select id, sum(abs(date(start_date) - date(prev))) as Agg_Change
from cte
group by 1
If anyone has a better solution, please let me know.
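One detail of the LAG approach is worth spelling out: the first row of each id has no predecessor, so prev is NULL there, and SUM simply skips that row. The sketch below illustrates this with an in-memory SQLite database as a stand-in for Vertica. It assumes SQLite 3.25 or later and uses julianday() subtraction instead of Vertica's date arithmetic; the variable names `per_row` and `lag_total` are made up for the example.

```python
import sqlite3

# In-memory SQLite database as a stand-in; the table name follows the query above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_temp (start_date TEXT, recorded_date TEXT, id TEXT)")
conn.executemany(
    "INSERT INTO table_temp VALUES (?, ?, ?)",
    [
        ("2021-11-10", "2021-11-01", "1a"),
        ("2021-11-08", "2021-11-02", "1a"),
        ("2021-11-11", "2021-11-03", "1a"),
        ("2021-11-10", "2021-11-04", "1a"),
        ("2021-11-10", "2021-11-05", "1a"),
    ],
)

# Per-row view: the earliest row has no previous start_date, so prev is NULL (None).
per_row = conn.execute("""
    SELECT recorded_date, start_date,
           LAG(start_date) OVER (PARTITION BY id ORDER BY recorded_date) AS prev
    FROM table_temp
    ORDER BY recorded_date
""").fetchall()

# Aggregate: ABS(NULL) is NULL and SUM skips NULLs, so the first row adds nothing.
lag_total = conn.execute("""
    WITH cte AS (
        SELECT id, start_date,
               LAG(start_date) OVER (PARTITION BY id ORDER BY recorded_date) AS prev
        FROM table_temp
    )
    SELECT id,
           CAST(SUM(ABS(julianday(start_date) - julianday(prev))) AS INTEGER) AS Agg_Change
    FROM cte
    GROUP BY id
""").fetchall()

print(per_row[0])
print(lag_total)
```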
Indeed, use LAG() in an OLAP query to get the previous date. The outer query then takes the absolute date difference and sums it, grouped by id:
WITH
-- your input - don't use in real query ...
indata(start_date,recorded_date,id) AS (
            SELECT DATE '2021-11-10',DATE '2021-11-01','1a'
  UNION ALL SELECT DATE '2021-11-08',DATE '2021-11-02','1a'
  UNION ALL SELECT DATE '2021-11-11',DATE '2021-11-03','1a'
  UNION ALL SELECT DATE '2021-11-10',DATE '2021-11-04','1a'
  UNION ALL SELECT DATE '2021-11-10',DATE '2021-11-05','1a'
)
-- real query starts here, replace following comma with "WITH" ...
,
w_lag AS (
  SELECT
    id
  , start_date
  , LAG(start_date) OVER w AS prevdt
  FROM indata
  WINDOW w AS (PARTITION BY id ORDER BY recorded_date)
)
SELECT
  id
, SUM(ABS(DATEDIFF(DAY,start_date,prevdt))) AS dtdiff
FROM w_lag
GROUP BY id
-- out  id | dtdiff
-- out ----+--------
-- out  1a |      6