Postgres Consecutive Days, gaps and islands, Tabibitosan

I have the following database table:

date name
2014-08-10 bob
2014-08-10 sue
2014-08-11 bob
2014-08-11 mike
2014-08-12 bob
2014-08-12 mike
2014-08-05 bob
2014-08-06 bob

SELECT t.Name,COUNT(*) as frequency
FROM (
    SELECT Name,Date,
            row_number() OVER (
            ORDER BY Date
            ) - row_number() OVER (
            PARTITION BY Name ORDER BY Date
            ) + 1 seq
    FROM orders
    ) t
GROUP BY Name,seq;

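I tried the Tabibitosan method for finding gaps and islands (the query above), and it produced the table below, which is incorrect. The name "mike" should have a count of 2, because the 11th and the 12th are consecutive days. How do I fix this?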

name frequency
mike 1
bob 3
bob 2
mike 1
sue 1

The correct, expected output is:

name frequency
bob 3
bob 2
mike 2
sue 1

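You are using the wrong logic. Basically, you want dates that are sequential, so you want to subtract the sequence number from the date. Within a run of consecutive days, both the date and the per-name row number advance by one per row, so their difference stays constant and identifies the island:

SELECT t.Name, COUNT(*) as frequency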
FROM (SELECT o.*,
             row_number() OVER (PARTITION BY Name ORDER BY Date) as seqnum
      FROM orders o
     ) t
GROUP BY Name, date - seqnum * interval '1 day';

Here is a db<>fiddle.
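
To see why grouping on "date minus sequence number" works, it can help to look at the intermediate values. The sketch below is not part of the answer above: it inlines the sample rows from the question as a CTE named orders (so no table is required), and it uses date - integer arithmetic rather than the interval form, which in PostgreSQL yields a date instead of a timestamp. The alias grp is a made-up name for illustration.

```sql
-- Sample rows from the question, inlined so the query is self-contained.
WITH orders(date, name) AS (
    VALUES (DATE '2014-08-10', 'bob'),
           ('2014-08-10', 'sue'),
           ('2014-08-11', 'bob'),
           ('2014-08-11', 'mike'),
           ('2014-08-12', 'bob'),
           ('2014-08-12', 'mike'),
           ('2014-08-05', 'bob'),
           ('2014-08-06', 'bob')
)
SELECT name,
       date,
       row_number() OVER (PARTITION BY name ORDER BY date) AS seqnum,
       -- row_number() returns bigint; cast to int so that date - integer applies.
       date - (row_number() OVER (PARTITION BY name ORDER BY date))::int AS grp
FROM orders
ORDER BY name, date;

-- name | date       | seqnum | grp
-- bob  | 2014-08-05 |      1 | 2014-08-04
-- bob  | 2014-08-06 |      2 | 2014-08-04
-- bob  | 2014-08-10 |      3 | 2014-08-07
-- bob  | 2014-08-11 |      4 | 2014-08-07
-- bob  | 2014-08-12 |      5 | 2014-08-07
-- mike | 2014-08-11 |      1 | 2014-08-10
-- mike | 2014-08-12 |      2 | 2014-08-10
-- sue  | 2014-08-10 |      1 | 2014-08-09
```

Each distinct (name, grp) pair is one island, so counting rows per pair gives bob 2, bob 3, mike 2 and sue 1, matching the expected output.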

Gaps and islands, solved in PostgreSQL:

Run this working demo:

drop table if exists foobar; 
CREATE TABLE foobar( tick text, date_val date ); 
insert into foobar values('XYZ', '2021-01-03');  --island 1 has width 2
insert into foobar values('XYZ', '2021-01-04');  --island 1
insert into foobar values('XYZ', '2021-05-09');  --island 2 has width 3
insert into foobar values('XYZ', '2021-05-10');  --island 2 
insert into foobar values('XYZ', '2021-05-11');  --island 2
insert into foobar values('XYZ', '2021-07-07');  --island 3 has width 4
insert into foobar values('XYZ', '2021-07-08');  --island 3
insert into foobar values('XYZ', '2021-07-09');  --island 3
insert into foobar values('XYZ', '2021-07-10');  --island 3 
insert into foobar values('XYZ', '2022-10-10');  --island 4 has width 1


select tick, island_width, min_val, max_val, 
       min_val - lag(max_val) over (order by max_val) 
       as gap_width from  
( 
  select tick, count(*) as island_width, 
         min(date_val) min_val, max(date_val) max_val 
  from ( 
    select t.*, 
    row_number() over ( partition by tick order by date_val ) as seqnum 
    from foobar t where tick = 'XYZ' 
    ) t 
  group by tick, date_val - seqnum * interval '1 day' 
) t2 order by max_val desc;

Prints:

┌──────┬──────────────┬────────────┬────────────┬───────────┐ 
│ tick │ island_width │  min_val   │  max_val   │ gap_width │ 
├──────┼──────────────┼────────────┼────────────┼───────────┤ 
│ XYZ  │            1 │ 2022-10-10 │ 2022-10-10 │       457 │ 
│ XYZ  │            4 │ 2021-07-07 │ 2021-07-10 │        57 │ 
│ XYZ  │            3 │ 2021-05-09 │ 2021-05-11 │       125 │ 
│ XYZ  │            2 │ 2021-01-03 │ 2021-01-04 │         ¤ │ 
└──────┴──────────────┴────────────┴────────────┴───────────┘ 

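The island_width column gives the length of each run of consecutive dates. gap_width gives the number of days from the end of the previous island to the start of this one, so the number of missing days is gap_width - 1. It is NULL (shown as ¤) for the earliest island, which has no predecessor.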
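
The demo query filters on a single tick and its lag() window has no PARTITION BY, so it would compare islands of different ticks if the filter were dropped. A possible generalization, offered only as a sketch: the rows for 'ABC' are hypothetical, the CTE names numbered and islands and the column missing_days are made up for illustration, and the data is inlined as a CTE so the query runs without the foobar table.

```sql
-- Inlined sample data; the 'ABC' rows are hypothetical additions.
WITH foobar(tick, date_val) AS (
    VALUES ('XYZ', DATE '2021-01-03'),
           ('XYZ', '2021-01-04'),
           ('XYZ', '2021-05-09'),
           ('ABC', '2021-01-04'),
           ('ABC', '2021-01-05'),
           ('ABC', '2021-01-09')
),
numbered AS (
    -- Number the rows of each tick in date order.
    SELECT tick, date_val,
           row_number() OVER (PARTITION BY tick ORDER BY date_val) AS seqnum
    FROM foobar
),
islands AS (
    -- date minus row number is constant within a run of consecutive dates.
    SELECT tick,
           count(*)      AS island_width,
           min(date_val) AS min_val,
           max(date_val) AS max_val
    FROM numbered
    GROUP BY tick, date_val - seqnum::int
)
SELECT tick, island_width, min_val, max_val,
       -- Partition by tick so each island is compared only with the
       -- previous island of the same tick.
       min_val - lag(max_val) OVER (PARTITION BY tick ORDER BY max_val) AS gap_width,
       min_val - lag(max_val) OVER (PARTITION BY tick ORDER BY max_val) - 1 AS missing_days
FROM islands
ORDER BY tick, max_val;

-- tick | island_width | min_val    | max_val    | gap_width | missing_days
-- ABC  |            2 | 2021-01-04 | 2021-01-05 |      NULL |         NULL
-- ABC  |            1 | 2021-01-09 | 2021-01-09 |         4 |            3
-- XYZ  |            2 | 2021-01-03 | 2021-01-04 |      NULL |         NULL
-- XYZ  |            1 | 2021-05-09 | 2021-05-09 |       125 |          124
```

Subtracting two dates in PostgreSQL yields an integer number of days, which is why gap_width and missing_days come out as plain numbers.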