Find date gaps in Postgres

I have a table with user IDs and some dates, like this:

    | userId |       dates        |
    | 1      | 2021-06-20 00:00:00|
    | 1      | 2021-06-24 00:00:00|
    | 2      | 2021-06-25 00:00:00|
    | 2      | 2021-06-28 00:00:00|
    | 2      | 2021-06-30 00:00:00|
    | 3      | 2021-06-22 00:00:00|
    | 3      | 2021-06-24 00:00:00|
    | 3      | 2021-06-27 00:00:00|

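For each userId, I want to find the first date that is missing, that is, the first day after the user's earliest date that has no row: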

Expected output:

    | userId |       dates        |
    | 1      | 2021-06-21 00:00:00|
    | 2      | 2021-06-26 00:00:00|
    | 3      | 2021-06-23 00:00:00|

I am using Postgres. Can anyone help? The data is very large, over 4m rows.

I think the simplest approach is lead() plus aggregation (here the table is called t and the date column is called date):

    select userid,
           min(date) + interval '1 day'
    from (select t.*,
                 lead(date) over (partition by userid order by date) as next_date
          from t
         ) t
    where next_date is null or next_date <> date + interval '1 day'
    group by userid;
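
To try the query above against the sample rows from the question, a setup along these lines should work. The table name `t` and the column name `date` are taken from the query; the column types are an assumption, since the question does not show the table definition.

```sql
-- Assumed schema: the question does not show the real table definition.
create table t (
    userid integer,
    date   timestamp
);

-- Sample rows copied from the question.
insert into t (userid, date) values
    (1, '2021-06-20'),
    (1, '2021-06-24'),
    (2, '2021-06-25'),
    (2, '2021-06-28'),
    (2, '2021-06-30'),
    (3, '2021-06-22'),
    (3, '2021-06-24'),
    (3, '2021-06-27');
```

With these rows, every row passes the where filter (no two dates of the same user are consecutive), so the query takes each user's earliest date plus one day: 2021-06-21 for user 1, 2021-06-26 for user 2 and 2021-06-23 for user 3, which matches the expected output in the question.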

Or using distinct on:

    select distinct on (userid) userid, date + interval '1 day'
    from (select t.*,
                 lead(date) over (partition by userid order by date) as next_date
          from t
         ) t
    where next_date is null or next_date <> date + interval '1 day'
    order by userid, date;

You can also write the where clause as:

    where next_date is distinct from date + interval '1 day'
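
The difference matters for the last row of each user, where next_date is null. A plain `<>` comparison against null yields null rather than true, which is why the earlier queries need the explicit `is null` check, while `is distinct from` treats null as an ordinary comparable value. A small illustration with literal values instead of table columns:

```sql
-- Compare an (integer) null with 1 using both operators.
select (null::int <> 1)               as plain_not_equal,    -- null
       (null::int is distinct from 1) as null_safe_compare;  -- true
```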

Here is a db<>fiddle.
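
Since the question mentions over 4m rows, an index that matches the window's `partition by` and `order by` columns may let Postgres read the rows already sorted instead of sorting the whole table. This is only a sketch, not a measured result: the index name is made up, and whether the planner actually uses the index depends on the data and the server configuration.

```sql
-- Hypothetical index name; the columns match "partition by userid order by date".
create index t_userid_date_idx on t (userid, date);

-- Inspect the plan to check whether the index is used and how long the query takes.
explain (analyze, buffers)
select userid,
       min(date) + interval '1 day'
from (select t.*,
             lead(date) over (partition by userid order by date) as next_date
      from t
     ) t
where next_date is null or next_date <> date + interval '1 day'
group by userid;
```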

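    -- Returns each user's earliest existing date, not the first missing one.
    -- The order by list must start with the distinct on expression (userId).
    select distinct on (userId) userId, dates
    from table_name
    order by userId, dates;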