查找连续值的范围
Finding ranges of continous values
带有 MySQL table 和时间戳。
|Timestamps |
|-----------------------|
|2021-08-01 14:00:00.000|
|2021-08-01 14:00:00.100|
|2021-08-01 14:00:00.200|
|2021-08-01 14:00:00.300|
|2021-08-01 14:00:00.600|
|2021-08-01 14:00:00.700|
|2021-08-01 14:00:00.800|
|2021-08-01 14:00:01.000|
我想获取连续数据的时间间隔。连续由频率定义,在这个例子中,它可以是 10Hz
期望的结果是
|Start | End |
|------------------------|------------------------|
|2021-08-01 14:00:00.000 | 2021-08-01 14:00:00.300|
|2021-08-01 14:00:00.600 | 2021-08-01 14:00:00.800|
我正在使用 MySQL 版本 5.7.35
并且不能使用 WITH 和其他功能。这可以快速完成吗?
有大约。现在每个 table 100000 个元素。
使用:
CREATE TABLE test_tbl (
my_data timestamp(3)
);
INSERT INTO test_tbl VALUES
('2021-08-01 14:00:00.000'),
('2021-08-01 14:00:00.100'),
('2021-08-01 14:00:00.200'),
('2021-08-01 14:00:00.300'),
('2021-08-01 14:00:00.600'),
('2021-08-01 14:00:00.700'),
('2021-08-01 14:00:00.800'),
('2021-08-01 14:00:01.000');
SELECT
MIN(my_data) start,
MAX(my_data) end
FROM
( SELECT *
, CASE WHEN right(my_data,3) = @prev+100 THEN @i:=@i ELSE @i:=@i+100 END row_num
, @prev:=right(my_data,3) prev
FROM test_tbl
, (SELECT @prev:= null,@i:=0) vars
ORDER BY row_num
) x
GROUP BY row_num;
插入速度
经验法则——“如果每秒写入次数少于 100 次,请不要担心性能。” (你问的是每秒 10 次,对吗?)(有超过 100 次的技巧。)
非窗口解决方案(8.0 或 10.2 之前的版本)
-- 样本数据
CREATE TABLE gaps (
ts timestamp(3),
PRIMARY KEY(ts) -- Some index on ts is important here
);
INSERT INTO gaps VALUES
('2021-08-01 14:00:00.000'),
('2021-08-01 14:00:00.100'),
('2021-08-01 14:00:00.200'),
('2021-08-01 14:00:00.300'),
('2021-08-01 14:00:00.600'),
('2021-08-01 14:00:00.700'),
('2021-08-01 14:00:00.800'),
('2021-08-01 14:00:01.000'), -- By itself
('2021-08-01 14:00:01.300');
-- 添加序号(通过AUTO_INCREMENT);
警告:这取决于 auto_increment_increment = 1
DROP TEMPORARY TABLE IF EXISTS gaps2;
CREATE TEMPORARY TABLE gaps2 (
id INT AUTO_INCREMENT,
PRIMARY KEY(id) )
SELECT val
FROM (
( SELECT a.ts as val
FROM gaps AS a
LEFT JOIN gaps AS b ON a.ts = b.ts + interval 0.1 second
WHERE b.ts IS NULL
)
UNION ALL
( SELECT b.ts as val
FROM gaps AS b
LEFT JOIN gaps AS a ON b.ts = a.ts - interval 0.1 second
WHERE a.ts IS NULL
)
ORDER BY val
) AS cd;
select * from gaps2;
+----+-------------------------+
| id | val |
+----+-------------------------+
| 1 | 2021-08-01 14:00:00.000 |
| 2 | 2021-08-01 14:00:00.300 |
| 3 | 2021-08-01 14:00:00.600 |
| 4 | 2021-08-01 14:00:00.800 |
| 5 | 2021-08-01 14:00:01.000 |
| 6 | 2021-08-01 14:00:01.000 |
| 7 | 2021-08-01 14:00:01.300 |
| 8 | 2021-08-01 14:00:01.300 |
+----+-------------------------+
-- 枢轴:
SELECT CEIL(id/2) AS pair,
MIN(val) AS 'run_start',
MAX(val) AS 'run_end'
FROM gaps2
GROUP BY pair
ORDER BY pair;
+------+-------------------------+-------------------------+
| pair | run_start | run_end |
+------+-------------------------+-------------------------+
| 1 | 2021-08-01 14:00:00.000 | 2021-08-01 14:00:00.300 |
| 2 | 2021-08-01 14:00:00.600 | 2021-08-01 14:00:00.800 |
| 3 | 2021-08-01 14:00:01.000 | 2021-08-01 14:00:01.000 |
| 4 | 2021-08-01 14:00:01.300 | 2021-08-01 14:00:01.300 |
+------+-------------------------+-------------------------+
得到第一个答案后,我开始修改它,昨晚我得到了一个工作示例。
SELECT
MIN(my_data) start,
MAX(my_data) end
FROM
( SELECT *
, CASE WHEN UNIX_TIMESTAMP(my_data)*1000 >= @prev+90 and
UNIX_TIMESTAMP(my_data)*1000 <= @prev+110
THEN @i:=@i ELSE @i:=@i+100 END row_num
, @prev:=UNIX_TIMESTAMP(my_data)*1000 prev
FROM test_tbl
, (SELECT @prev:= null,@i:=0) vars
ORDER BY row_num
) x
GROUP BY row_num;
我会标记第一个答案,因为它给了灵感。但感谢您的输入。
带有 MySQL table 和时间戳。
|Timestamps |
|-----------------------|
|2021-08-01 14:00:00.000|
|2021-08-01 14:00:00.100|
|2021-08-01 14:00:00.200|
|2021-08-01 14:00:00.300|
|2021-08-01 14:00:00.600|
|2021-08-01 14:00:00.700|
|2021-08-01 14:00:00.800|
|2021-08-01 14:00:01.000|
我想获取连续数据的时间间隔。连续由频率定义,在这个例子中,它可以是 10Hz 期望的结果是
|Start | End |
|------------------------|------------------------|
|2021-08-01 14:00:00.000 | 2021-08-01 14:00:00.300|
|2021-08-01 14:00:00.600 | 2021-08-01 14:00:00.800|
我正在使用 MySQL 版本 5.7.35 并且不能使用 WITH 和其他功能。这可以快速完成吗? 有大约。现在每个 table 100000 个元素。
使用:
CREATE TABLE test_tbl (
my_data timestamp(3)
);
INSERT INTO test_tbl VALUES
('2021-08-01 14:00:00.000'),
('2021-08-01 14:00:00.100'),
('2021-08-01 14:00:00.200'),
('2021-08-01 14:00:00.300'),
('2021-08-01 14:00:00.600'),
('2021-08-01 14:00:00.700'),
('2021-08-01 14:00:00.800'),
('2021-08-01 14:00:01.000');
SELECT
MIN(my_data) start,
MAX(my_data) end
FROM
( SELECT *
, CASE WHEN right(my_data,3) = @prev+100 THEN @i:=@i ELSE @i:=@i+100 END row_num
, @prev:=right(my_data,3) prev
FROM test_tbl
, (SELECT @prev:= null,@i:=0) vars
ORDER BY row_num
) x
GROUP BY row_num;
插入速度
经验法则——“如果每秒写入次数少于 100 次,请不要担心性能。” (你问的是每秒 10 次,对吗?)(有超过 100 次的技巧。)
非窗口解决方案(8.0 或 10.2 之前的版本)
-- 样本数据
CREATE TABLE gaps (
ts timestamp(3),
PRIMARY KEY(ts) -- Some index on ts is important here
);
INSERT INTO gaps VALUES
('2021-08-01 14:00:00.000'),
('2021-08-01 14:00:00.100'),
('2021-08-01 14:00:00.200'),
('2021-08-01 14:00:00.300'),
('2021-08-01 14:00:00.600'),
('2021-08-01 14:00:00.700'),
('2021-08-01 14:00:00.800'),
('2021-08-01 14:00:01.000'), -- By itself
('2021-08-01 14:00:01.300');
-- 添加序号(通过AUTO_INCREMENT);
警告:这取决于 auto_increment_increment = 1
DROP TEMPORARY TABLE IF EXISTS gaps2;
CREATE TEMPORARY TABLE gaps2 (
id INT AUTO_INCREMENT,
PRIMARY KEY(id) )
SELECT val
FROM (
( SELECT a.ts as val
FROM gaps AS a
LEFT JOIN gaps AS b ON a.ts = b.ts + interval 0.1 second
WHERE b.ts IS NULL
)
UNION ALL
( SELECT b.ts as val
FROM gaps AS b
LEFT JOIN gaps AS a ON b.ts = a.ts - interval 0.1 second
WHERE a.ts IS NULL
)
ORDER BY val
) AS cd;
select * from gaps2;
+----+-------------------------+
| id | val |
+----+-------------------------+
| 1 | 2021-08-01 14:00:00.000 |
| 2 | 2021-08-01 14:00:00.300 |
| 3 | 2021-08-01 14:00:00.600 |
| 4 | 2021-08-01 14:00:00.800 |
| 5 | 2021-08-01 14:00:01.000 |
| 6 | 2021-08-01 14:00:01.000 |
| 7 | 2021-08-01 14:00:01.300 |
| 8 | 2021-08-01 14:00:01.300 |
+----+-------------------------+
-- 枢轴:
SELECT CEIL(id/2) AS pair,
MIN(val) AS 'run_start',
MAX(val) AS 'run_end'
FROM gaps2
GROUP BY pair
ORDER BY pair;
+------+-------------------------+-------------------------+
| pair | run_start | run_end |
+------+-------------------------+-------------------------+
| 1 | 2021-08-01 14:00:00.000 | 2021-08-01 14:00:00.300 |
| 2 | 2021-08-01 14:00:00.600 | 2021-08-01 14:00:00.800 |
| 3 | 2021-08-01 14:00:01.000 | 2021-08-01 14:00:01.000 |
| 4 | 2021-08-01 14:00:01.300 | 2021-08-01 14:00:01.300 |
+------+-------------------------+-------------------------+
得到第一个答案后,我开始修改它,昨晚我得到了一个工作示例。
SELECT
MIN(my_data) start,
MAX(my_data) end
FROM
( SELECT *
, CASE WHEN UNIX_TIMESTAMP(my_data)*1000 >= @prev+90 and
UNIX_TIMESTAMP(my_data)*1000 <= @prev+110
THEN @i:=@i ELSE @i:=@i+100 END row_num
, @prev:=UNIX_TIMESTAMP(my_data)*1000 prev
FROM test_tbl
, (SELECT @prev:= null,@i:=0) vars
ORDER BY row_num
) x
GROUP BY row_num;
我会标记第一个答案,因为它给了灵感。但感谢您的输入。