Selecting first and last time stamps of a section
I have a MySQL database with a table like:
CREATE TABLE example (Batch_Num int, Time_Stamp datetime);
INSERT INTO example VALUES
(1, '2020-12-10 16:37:43'),
(1, '2020-12-11 09:47:31'),
(1, '2020-12-11 14:02:17'),
(1, '2020-12-11 15:28:02'),
(2, '2020-12-12 15:08:52'),
(2, '2020-12-14 10:38:02'),
(2, '2020-12-14 16:22:35'),
(2, '2020-12-15 08:44:13'),
(3, '2020-12-16 11:38:05'),
(3, '2020-12-17 10:19:13'),
(3, '2020-12-17 14:45:28');
+-----------+-----------------------+
| Batch_Num | Time_Stamp |
+-----------+-----------------------+
| 1 | '2020-12-10 16:37:43' |
| 1 | '2020-12-11 09:47:31' |
| 1 | '2020-12-11 14:02:17' |
| 1 | '2020-12-11 15:28:02' |
| 2 | '2020-12-12 15:08:52' |
| 2 | '2020-12-14 10:38:02' |
| 2 | '2020-12-14 16:22:35' |
| 2 | '2020-12-15 08:44:13' |
| 3 | '2020-12-16 11:38:05' |
| 3 | '2020-12-17 10:19:13' |
| 3 | '2020-12-17 14:45:28' |
+-----------+-----------------------+
From this table, I want to select the first and last time stamp for each value of Batch_Num. I would like the result to look like:
+-----------+-----------------------+-----------------------+
| Batch_Num | Beginning_Time_Stamp | End_Time_Stamp |
+-----------+-----------------------+-----------------------+
| 1 | '2020-12-10 16:37:43' | '2020-12-11 15:28:02' |
| 2 | '2020-12-12 15:08:52' | '2020-12-15 08:44:13' |
| 3 | '2020-12-16 11:38:05' | '2020-12-17 14:45:28' |
+-----------+-----------------------+-----------------------+
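I am not sure how to select both the row where the previous Batch_Num differs from the current one and the row where the next one differs.
A basic GROUP BY query should work here: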
SELECT
Batch_Num,
MIN(Time_Stamp) AS Beginning_Time_Stamp,
MAX(Time_Stamp) AS End_Time_Stamp
FROM example
GROUP BY
Batch_Num
ORDER BY
Batch_Num;
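If the same batch number may appear in separate series, then aggregation alone will not solve the problem. You would typically use a gaps-and-islands technique for this; here, a simple approach is to use the difference between row numbers to identify groups of adjacent records (islands):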
select batch_num,
min(time_stamp) as start_time_stamp,
max(time_stamp) as end_time_stamp,
count(*) as cnt
from (
select e.*,
row_number() over(order by time_stamp) as rn1,
row_number() over(partition by batch_num order by time_stamp) as rn2
from example e
) t
group by batch_num, rn1 - rn2
order by start_time_stamp
In the demo, I added a new batch 1 at the end of the dataset:
batch_num | start_time_stamp | end_time_stamp | cnt
--------: | :------------------ | :------------------ | --:
1 | 2020-12-10 16:37:43 | 2020-12-11 15:28:02 | 4
2 | 2020-12-12 15:08:52 | 2020-12-15 08:44:13 | 4
3 | 2020-12-16 11:38:05 | 2020-12-17 14:45:28 | 3
1 | 2020-12-18 14:02:17 | 2020-12-18 15:28:02 | 2
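To see why grouping by rn1 - rn2 separates the two runs of batch 1, here is a minimal Python sketch that mimics the two row_number() calls by hand. It is only an illustration, not a replacement for the SQL: the function name `islands` is made up for this example, and the two extra batch 1 rows are inferred from the last line of the result above (start, end, and a count of 2).

```python
from collections import defaultdict

# Rows from the question, plus the two extra batch 1 rows
# inferred from the demo output (assumption, see the text above).
rows = [
    (1, "2020-12-10 16:37:43"), (1, "2020-12-11 09:47:31"),
    (1, "2020-12-11 14:02:17"), (1, "2020-12-11 15:28:02"),
    (2, "2020-12-12 15:08:52"), (2, "2020-12-14 10:38:02"),
    (2, "2020-12-14 16:22:35"), (2, "2020-12-15 08:44:13"),
    (3, "2020-12-16 11:38:05"), (3, "2020-12-17 10:19:13"),
    (3, "2020-12-17 14:45:28"),
    (1, "2020-12-18 14:02:17"), (1, "2020-12-18 15:28:02"),
]


def islands(rows):
    """Return (batch, start, end, count) per run of adjacent rows (hypothetical helper)."""
    # Timestamps in 'YYYY-MM-DD HH:MM:SS' form sort chronologically as strings.
    ordered = sorted(rows, key=lambda r: r[1])
    seen_per_batch = defaultdict(int)
    groups = {}
    for rn1, (batch, ts) in enumerate(ordered, start=1):  # rn1: global row number
        seen_per_batch[batch] += 1
        rn2 = seen_per_batch[batch]                       # rn2: row number within the batch
        # Within one uninterrupted run, rn1 and rn2 grow together,
        # so their difference stays constant and identifies the island.
        groups.setdefault((batch, rn1 - rn2), []).append(ts)
    result = [(batch, min(ts), max(ts), len(ts)) for (batch, _), ts in groups.items()]
    return sorted(result, key=lambda r: r[1])  # order by start time stamp


for row in islands(rows):
    print(row)
```

The first run of batch 1 gets a difference of 0 (rn1 1 to 4, rn2 1 to 4), while the second run gets 7 (rn1 12 to 13, rn2 5 to 6), so the two runs fall into different groups even though they share a batch number.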