Selecting first and last time stamps of a section

I have a MySQL database with a table like this:

CREATE TABLE example (Batch_Num int, Time_Stamp datetime);
INSERT INTO example VALUES
    (1, '2020-12-10 16:37:43'),
    (1, '2020-12-11 09:47:31'),
    (1, '2020-12-11 14:02:17'),
    (1, '2020-12-11 15:28:02'),
    (2, '2020-12-12 15:08:52'),
    (2, '2020-12-14 10:38:02'),
    (2, '2020-12-14 16:22:35'),
    (2, '2020-12-15 08:44:13'),
    (3, '2020-12-16 11:38:05'),
    (3, '2020-12-17 10:19:13'),
    (3, '2020-12-17 14:45:28');

+-----------+-----------------------+
| Batch_Num |      Time_Stamp       |
+-----------+-----------------------+
|         1 | '2020-12-10 16:37:43' |
|         1 | '2020-12-11 09:47:31' |
|         1 | '2020-12-11 14:02:17' |
|         1 | '2020-12-11 15:28:02' |
|         2 | '2020-12-12 15:08:52' |
|         2 | '2020-12-14 10:38:02' |
|         2 | '2020-12-14 16:22:35' |
|         2 | '2020-12-15 08:44:13' |
|         3 | '2020-12-16 11:38:05' |
|         3 | '2020-12-17 10:19:13' |
|         3 | '2020-12-17 14:45:28' |
+-----------+-----------------------+

From this table I want to select the first and last timestamp for each value of Batch_Num. I would like the result to look like this:

+-----------+-----------------------+-----------------------+
| Batch_Num | Beginning_Time_Stamp  |    End_Time_Stamp     |
+-----------+-----------------------+-----------------------+
|         1 | '2020-12-10 16:37:43' | '2020-12-11 15:28:02' |
|         2 | '2020-12-12 15:08:52' | '2020-12-15 08:44:13' |
|         3 | '2020-12-16 11:38:05' | '2020-12-17 14:45:28' |
+-----------+-----------------------+-----------------------+

I'm not sure how to select the rows where the previous Batch_Num differs from the current one, and the rows where the next one differs.

A basic GROUP BY query should work here:

SELECT
    Batch_Num,
    MIN(Time_Stamp) AS Beginning_Time_Stamp,
    MAX(Time_Stamp) AS End_Time_Stamp
FROM example
GROUP BY
    Batch_Num
ORDER BY
    Batch_Num;

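If the same batch number can appear in several separate runs, aggregation alone will not solve the problem. You would typically use a gaps-and-islands technique for this; here, a simple approach is to use the difference between two row numbers to identify groups of adjacent records (islands):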

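select batch_num,
    min(time_stamp) as start_time_stamp,
    max(time_stamp) as end_time_stamp,
    count(*) as cnt
from (
    select e.*,
        -- rn1: position of the row in the whole table, ordered by time
        row_number() over(order by time_stamp) as rn1,
        -- rn2: position of the row within its own batch, ordered by time
        row_number() over(partition by batch_num order by time_stamp) as rn2
    from example e
) t
-- rn1 - rn2 stays constant across consecutive rows of the same batch
group by batch_num, rn1 - rn2
order by start_time_stamp;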

To demonstrate, I added a new run of batch 1 at the end of the dataset. The query then returns:

batch_num | start_time_stamp    | end_time_stamp      | cnt
--------: | :------------------ | :------------------ | --:
        1 | 2020-12-10 16:37:43 | 2020-12-11 15:28:02 |   4
        2 | 2020-12-12 15:08:52 | 2020-12-15 08:44:13 |   4
        3 | 2020-12-16 11:38:05 | 2020-12-17 14:45:28 |   3
        1 | 2020-12-18 14:02:17 | 2020-12-18 15:28:02 |   2
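
To see why rn1 - rn2 identifies an island, it can help to run the inner query on its own. The sketch below assumes MySQL 8.0 or later (earlier versions have no window functions). It also assumes the extra run of batch 1 consists of exactly the two rows implied by the result above: cnt is 2, so the start and end timestamps must be the only rows in that run. The alias island_key is my own name, used for illustration only.

```sql
-- Assumed extra rows: the second run of batch 1, inferred from the result above.
INSERT INTO example VALUES
    (1, '2020-12-18 14:02:17'),
    (1, '2020-12-18 15:28:02');

-- Inner query only, with the row-number difference exposed as a column.
SELECT
    batch_num,
    time_stamp,
    ROW_NUMBER() OVER (ORDER BY time_stamp) AS rn1,
    ROW_NUMBER() OVER (PARTITION BY batch_num ORDER BY time_stamp) AS rn2,
    ROW_NUMBER() OVER (ORDER BY time_stamp)
      - ROW_NUMBER() OVER (PARTITION BY batch_num ORDER BY time_stamp) AS island_key
FROM example
ORDER BY time_stamp;
```

Within a run of consecutive rows from the same batch, rn1 and rn2 both increase by one per row, so their difference does not change. When rows from another batch come in between, rn1 keeps advancing while rn2 does not, so the difference jumps. On this data island_key is 0 for the first run of batch 1, 4 for batch 2, 8 for batch 3, and 7 for the second run of batch 1. The difference is only meaningful together with batch_num, which is why the outer query groups by both.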