SQL to find sum of total days in a window for a series of changes

Here is the table:

start_date recorded_date id
2021-11-10 2021-11-01 1a
2021-11-08 2021-11-02 1a
2021-11-11 2021-11-03 1a
2021-11-10 2021-11-04 1a
2021-11-10 2021-11-05 1a

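I need a query that finds the total change in days for a given id. In this case, start_date moves from Nov 10 to Nov 8 (2 days), then from Nov 8 to Nov 11 (3 days), then from Nov 11 to Nov 10 (1 day), and finally from Nov 10 to Nov 10 (0 days).

So id '1a' has a total change of 2+3+1+0 = 6 days.

Each change has a recorded_date, so the rows are ordered by recorded_date ascending, and the absolute day differences between consecutive start_date values are summed per id. The final result should look like this: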

id Agg_Change
1a 6
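As a cross-check of the arithmetic above, here is a minimal Python sketch (not SQL, and not tied to Vertica) that sorts the sample rows by recorded_date and sums the absolute day differences between consecutive start_date values. The function name `agg_change` and the tuple layout are assumptions made purely for illustration.

```python
from collections import defaultdict
from datetime import date

# Sample rows from the table above: (start_date, recorded_date, id)
rows = [
    (date(2021, 11, 10), date(2021, 11, 1), "1a"),
    (date(2021, 11, 8), date(2021, 11, 2), "1a"),
    (date(2021, 11, 11), date(2021, 11, 3), "1a"),
    (date(2021, 11, 10), date(2021, 11, 4), "1a"),
    (date(2021, 11, 10), date(2021, 11, 5), "1a"),
]


def agg_change(rows):
    """Per id, sum the absolute day change in start_date, ordered by recorded_date."""
    by_id = defaultdict(list)
    for start, recorded, id_ in rows:
        by_id[id_].append((recorded, start))

    totals = {}
    for id_, items in by_id.items():
        items.sort()  # ascending by recorded_date
        starts = [start for _, start in items]
        # Pair each start_date with the one recorded just before it.
        totals[id_] = sum(
            abs((cur - prev).days) for prev, cur in zip(starts, starts[1:])
        )
    return totals


print(agg_change(rows))  # {'1a': 6}
```

The first row of each id has no predecessor and therefore contributes nothing, which matches the 2+3+1+0 breakdown.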

Is there a way to do this in SQL? I am using a Vertica database.

Thanks.

You can use the window function lead to get the difference between consecutive rows, then group by id:

select id, sum(daydiff) Agg_Change
from (
select id, abs(datediff(day, start_Date, lead(start_date,1,start_date) over (partition by id order by recorded_date))) as daydiff
from tablename
) t group by id 
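The third argument to lead in the query above is a default that applies when there is no next row: the last row of each id is compared with its own start_date and contributes 0. A small Python sketch of that behaviour on the sample data follows; the helper `lead_with_default` is a hypothetical emulation written for this illustration, not a database function.

```python
from datetime import date

# start_date values for id '1a', already ordered by recorded_date
starts = [
    date(2021, 11, 10),
    date(2021, 11, 8),
    date(2021, 11, 11),
    date(2021, 11, 10),
    date(2021, 11, 10),
]


def lead_with_default(values):
    """Emulate lead(x, 1, x): the next value, or the row's own value on the last row."""
    return [
        values[i + 1] if i + 1 < len(values) else values[i]
        for i in range(len(values))
    ]


daydiff = [
    abs((nxt - cur).days) for cur, nxt in zip(starts, lead_with_default(starts))
]
print(daydiff)       # [2, 3, 1, 0, 0]
print(sum(daydiff))  # 6
```

The trailing 0 comes from the default on the last row, so the total is still 6.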

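I thought the lag function would give me the answer, but it kept returning wrong results because of a logic error in one place in my query. I now have the answer I need: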

with cte as(
select id, start_date, recorded_date,
row_number() over(partition by id order by recorded_date asc) as idrank,
lag(start_date,1) over(partition by id order by recorded_date asc) as prev
from table_temp
)
select id, sum(abs(date(start_date) - date(prev))) as Agg_Change
from cte
group by 1

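If anyone has a better solution, please let me know.

Indeed: use LAG() in an OLAP query to get the previous date, then in the outer query take the absolute date difference and sum it, grouped by id: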

WITH
-- your input - don't use in real query ...
indata(start_date,recorded_date,id) AS (
          SELECT DATE '2021-11-10',DATE '2021-11-01','1a'
UNION ALL SELECT DATE '2021-11-08',DATE '2021-11-02','1a'
UNION ALL SELECT DATE '2021-11-11',DATE '2021-11-03','1a'
UNION ALL SELECT DATE '2021-11-10',DATE '2021-11-04','1a'
UNION ALL SELECT DATE '2021-11-10',DATE '2021-11-05','1a'
)
-- real query starts here, replace following comma with "WITH" ...
,
w_lag AS (
  SELECT
    id
  , start_date
  , LAG(start_date) OVER w AS prevdt
  FROM indata
  WINDOW w AS (PARTITION BY id ORDER BY recorded_date)
)
SELECT
  id
, SUM(ABS(DATEDIFF(DAY,start_date,prevdt))) AS dtdiff
FROM w_lag
GROUP BY id
-- out  id | dtdiff 
-- out ----+--------
-- out  1a |      6