Postgres: Aggregate rows based on flag change

Hi all, maybe someone here knows how to do this. I have a table in the following format:

id          timestamp           status value 
82240589    2020-03-01 09:13:46 70     22.00
82240589    2020-03-01 09:13:57 70     34.00
82240589    2020-03-01 09:14:14 70     21.00
82240589    2020-03-01 09:14:22 70     47.00
82240589    2020-03-01 09:14:33 70     32.00
82240589    2020-03-01 09:14:43 83     37.00
82240589    2020-03-01 09:14:52 83     44.00
82240589    2020-03-01 09:15:01 83     39.00
82240589    2020-03-01 09:15:10 70     40.00
82240589    2020-03-01 09:15:19 70     40.00
82240589    2020-03-01 09:16:30 70      5.00
82240589    2020-03-01 09:16:37 70     43.00
82240589    2020-03-01 09:16:46 70     46.00
82240589    2020-03-01 09:16:53 70     53.00
82240589    2020-03-01 09:17:00 70     55.00
82240589    2020-03-01 09:17:08 70     50.00
82240589    2020-03-01 09:17:16 70     46.00
82240589    2020-03-01 09:17:52 70     10.00

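I need to aggregate the output by id and by change of status. In addition, I need the sum of all values within each such period. So, for example, the output should look like this: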

id          timestamp_start         timestamp_end               status sum_value
82240589    2020-03-01 09:13:46     2020-03-01 09:14:33         70     ####
82240589    2020-03-01 09:14:43     2020-03-01 09:15:01         83     ####
82240589    2020-03-01 09:15:10     2020-03-01 09:17:52         70     ####

This is a gaps-and-islands problem.

select id, 
       min("timestamp") as start_at, 
       max("timestamp") as end_at,
       status,
       sum(value) as sum_value
from ( 
  -- running sum of the flag per id: every status change starts a new group number
  select id, "timestamp", status, value, 
         group_flag, 
         sum(group_flag) over (partition by id order by "timestamp") as group_nr
  from (
    -- flag is 1 when the status differs from the previous row of the same id
    select *, 
           case 
             when lag(status,1,status) over (partition by id order by "timestamp") = status then 0
             else 1
           end as group_flag
    from data
  ) t1
) t2
group by group_nr, status, id
order by id, start_at

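So the innermost query creates a flag that is 1 on every row where the status differs from the previous row (for the same id value) and 0 otherwise. The third argument to lag() supplies the row's own status as the default, so the first row of each id compares equal to itself and gets 0.

For the given data, the result is: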

id       | timestamp           | status | value | group_flag
---------+---------------------+--------+-------+-----------
82240589 | 2020-03-01 09:13:46 |     70 | 22.00 |          0
82240589 | 2020-03-01 09:13:57 |     70 | 34.00 |          0
82240589 | 2020-03-01 09:14:14 |     70 | 21.00 |          0
82240589 | 2020-03-01 09:14:22 |     70 | 47.00 |          0
82240589 | 2020-03-01 09:14:33 |     70 | 32.00 |          0
82240589 | 2020-03-01 09:14:43 |     83 | 37.00 |          1
82240589 | 2020-03-01 09:14:52 |     83 | 44.00 |          0
82240589 | 2020-03-01 09:15:01 |     83 | 39.00 |          0
82240589 | 2020-03-01 09:15:10 |     70 | 40.00 |          1
82240589 | 2020-03-01 09:15:19 |     70 | 40.00 |          0
82240589 | 2020-03-01 09:16:30 |     70 |  5.00 |          0
82240589 | 2020-03-01 09:16:37 |     70 | 43.00 |          0
82240589 | 2020-03-01 09:16:46 |     70 | 46.00 |          0
82240589 | 2020-03-01 09:16:53 |     70 | 53.00 |          0
82240589 | 2020-03-01 09:17:00 |     70 | 55.00 |          0
82240589 | 2020-03-01 09:17:08 |     70 | 50.00 |          0
82240589 | 2020-03-01 09:17:16 |     70 | 46.00 |          0
82240589 | 2020-03-01 09:17:52 |     70 | 10.00 |          0

The next level then creates groups based on that flag by taking a running sum of it. For the given data, the result is:

id       | timestamp           | status | value | group_nr
---------+---------------------+--------+-------+---------
82240589 | 2020-03-01 09:13:46 |     70 | 22.00 |        0
82240589 | 2020-03-01 09:13:57 |     70 | 34.00 |        0
82240589 | 2020-03-01 09:14:14 |     70 | 21.00 |        0
82240589 | 2020-03-01 09:14:22 |     70 | 47.00 |        0
82240589 | 2020-03-01 09:14:33 |     70 | 32.00 |        0
82240589 | 2020-03-01 09:14:43 |     83 | 37.00 |        1
82240589 | 2020-03-01 09:14:52 |     83 | 44.00 |        1
82240589 | 2020-03-01 09:15:01 |     83 | 39.00 |        1
82240589 | 2020-03-01 09:15:10 |     70 | 40.00 |        2
82240589 | 2020-03-01 09:15:19 |     70 | 40.00 |        2
82240589 | 2020-03-01 09:16:30 |     70 |  5.00 |        2
82240589 | 2020-03-01 09:16:37 |     70 | 43.00 |        2
82240589 | 2020-03-01 09:16:46 |     70 | 46.00 |        2
82240589 | 2020-03-01 09:16:53 |     70 | 53.00 |        2
82240589 | 2020-03-01 09:17:00 |     70 | 55.00 |        2
82240589 | 2020-03-01 09:17:08 |     70 | 50.00 |        2
82240589 | 2020-03-01 09:17:16 |     70 | 46.00 |        2
82240589 | 2020-03-01 09:17:52 |     70 | 10.00 |        2

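As we can see, the different "groups" produced by the status flag now each have a unique number. That number can be used for grouping/aggregating, which is done in the outermost query.

The nesting of the queries is necessary because you cannot nest window function calls.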
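
If you want to try this without creating a table, here is a minimal self-contained sketch. It inlines the sample rows from the question as a CTE named data, and writes the two inner levels as CTEs instead of derived tables. The names flagged and numbered are my own, not something from the question. Each CTE is still its own query level, so the window functions are not nested. The running sum is partitioned by id here, on the assumption that the group numbering should restart for every id.

```sql
-- Sample rows from the question, inlined so that no table is needed.
with data (id, "timestamp", status, value) as (
  values
    (82240589, timestamp '2020-03-01 09:13:46', 70, 22.00),
    (82240589, timestamp '2020-03-01 09:13:57', 70, 34.00),
    (82240589, timestamp '2020-03-01 09:14:14', 70, 21.00),
    (82240589, timestamp '2020-03-01 09:14:22', 70, 47.00),
    (82240589, timestamp '2020-03-01 09:14:33', 70, 32.00),
    (82240589, timestamp '2020-03-01 09:14:43', 83, 37.00),
    (82240589, timestamp '2020-03-01 09:14:52', 83, 44.00),
    (82240589, timestamp '2020-03-01 09:15:01', 83, 39.00),
    (82240589, timestamp '2020-03-01 09:15:10', 70, 40.00),
    (82240589, timestamp '2020-03-01 09:15:19', 70, 40.00),
    (82240589, timestamp '2020-03-01 09:16:30', 70,  5.00),
    (82240589, timestamp '2020-03-01 09:16:37', 70, 43.00),
    (82240589, timestamp '2020-03-01 09:16:46', 70, 46.00),
    (82240589, timestamp '2020-03-01 09:16:53', 70, 53.00),
    (82240589, timestamp '2020-03-01 09:17:00', 70, 55.00),
    (82240589, timestamp '2020-03-01 09:17:08', 70, 50.00),
    (82240589, timestamp '2020-03-01 09:17:16', 70, 46.00),
    (82240589, timestamp '2020-03-01 09:17:52', 70, 10.00)
),
-- Step 1: mark rows whose status differs from the previous row of the same id.
flagged as (
  select *,
         case
           when lag(status, 1, status) over (partition by id order by "timestamp") = status then 0
           else 1
         end as group_flag
  from data
),
-- Step 2: a running sum of the flag gives each run of equal status its own number.
numbered as (
  select *,
         sum(group_flag) over (partition by id order by "timestamp") as group_nr
  from flagged
)
-- Step 3: aggregate each numbered run.
select id,
       min("timestamp") as timestamp_start,
       max("timestamp") as timestamp_end,
       status,
       sum(value) as sum_value
from numbered
group by id, group_nr, status
order by id, timestamp_start;
```

For the sample rows this gives three rows, matching the periods in the question: status 70 with a sum of 156, status 83 with a sum of 120, and status 70 again with a sum of 388.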
