在 sql 中对连续时间间隔进行分组

Question

假设数据结构为

stock_name, action, start_date, end_date
google, growing, 1, 2
google, growing, 2, 3
google, falling, 3, 4
google, growing, 4, 5
yahoo, growing, 1, 2

如何聚合它以合并连续的时间间隔？

输出如下：

stock_name, action, start_date, end_date
google, growing, 1, 3
google, falling, 3, 4
google, growing, 4, 5
yahoo, growing, 1, 2

我想过使用 rank window 函数用一个常量对连续数进行编号，然后根据它和 action/name 进行分组，但我无法完全实现它，如下所示：

stock_name, action, start_date, end_date, rank
google, growing, 1, 2, 1
google, growing, 2, 3, 1
google, falling, 3, 4, 1
google, growing, 4, 5, 2
yahoo, growing, 1, 2, 1

如果是Mysql，我会很容易地用变量解决它，但这在 postgres 中是不可能的。

可以有任意数量的连续间隔，因此不能自行加入预定次数。

解决方案的优雅（性能、可读性）很重要。

Answer 1

您可以在 PL/pgSQL 中很好地使用变量。

我会用 table 函数解决这个问题。

假设 table 被称为 stock，我的代码将如下所示：

CREATE OR REPLACE FUNCTION combine_periods() RETURNS SETOF stock
   LANGUAGE plpgsql STABLE AS
$$DECLARE
   s stock;
   period stock;
BEGIN
   FOR s IN
      SELECT stock_name, action, start_date, end_date
      FROM stock
      ORDER BY stock_name, action, start_date
   LOOP
      /* is this a new period? */
      IF period IS NOT NULL AND
         (period.stock_name <> s.stock_name
            OR period.action <> s.action
            OR period.end_date <> s.start_date)
      THEN
         /* new period, output last period */
         RETURN NEXT period;
         period := NULL;
      ELSE
         IF period IS NOT NULL
         THEN
            /* period continues, update end_date */
            period.end_date := s.end_date;
         END IF;
      END IF;

      /* remember the beginning of a new period */
      IF period IS NULL
      THEN
         period := s;
      END IF;
   END LOOP;

   /* output the last period */
   IF period IS NOT NULL
   THEN
      RETURN NEXT period;
   END IF;

   RETURN;
END;$$;

我会这样称呼它：

test=> SELECT * FROM combine_periods();
┌────────────┬─────────┬────────────┬──────────┐
│ stock_name │ action  │ start_date │ end_date │
├────────────┼─────────┼────────────┼──────────┤
│ google     │ falling │          3 │        4 │
│ google     │ growing │          1 │        3 │
│ google     │ growing │          4 │        5 │
│ yahoo      │ growing │          1 │        2 │
└────────────┴─────────┴────────────┴──────────┘
(4 rows)

在 sql 中对连续时间间隔进行分组

group consecutive time intervals in sql

sql

postgresql

gaps-and-islands