Find all rows that have two minutes difference in date

I have a table ACQUISITION with 1,720,208 rows.

------------------------------------------------------
| id           | date                    | value     |
|--------------|-------------------------|-----------|
| 1820188      | 2011-01-22 17:48:56     | 1.287     |
| 1820187      | 2011-01-21 21:55:11     | 2.312     |
| 1820186      | 2011-01-21 21:54:00     | 2.313     |
| 1820185      | 2011-01-20 17:46:10     | 1.755     |
| 1820184      | 2011-01-20 17:45:05     | 1.785     |
| 1820183      | 2011-01-19 18:21:02     | 2.001     |
------------------------------------------------------

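After an incident, I need to find every row whose date is less than two minutes away from another row's date.

Ideally, for the sample above, I would get: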

| 1820187      | 2011-01-21 21:55:11     | 2.312     |
| 1820186      | 2011-01-21 21:54:00     | 2.313     |
| 1820185      | 2011-01-20 17:46:10     | 1.755     |
| 1820184      | 2011-01-20 17:45:05     | 1.785     |

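If you have any ideas, I'm completely lost here.

Do a self-join on the table and compare the timestamps. Note that TIMEDIFF() returns a TIME value, so comparing it with the bare number 2 does not mean "two minutes", and a row must not be matched against itself. TIMESTAMPDIFF() in seconds is easier to get right, for example:

SELECT DISTINCT t1.*
FROM ACQUISITION t1
JOIN ACQUISITION t2
  ON t1.id <> t2.id
 AND ABS(TIMESTAMPDIFF(SECOND, t1.`date`, t2.`date`)) < 120;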

Let's restate your problem slightly, so that we can finish this query before the heat death of the universe.

"I need to know the consecutive records in the table with timestamps closer together than two minutes."

We can tie the notion of "consecutive" to your id values.

Try this query and see whether you get decent performance (http://sqlfiddle.com/#!9/28738/2/0)

SELECT a.date first_date, a.id first_id, a.value first_value,
       b.id second_id, b.value second_value,
       TIMESTAMPDIFF(SECOND, a.date, b.date) delta_t
  FROM thetable AS a
  JOIN thetable AS b  ON b.id = a.id + 1 
                     AND b.date <= a.date + INTERVAL 2 MINUTE

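The condition ON b.id = a.id + 1 restricts the self-join to consecutive rows, which keeps the workload manageable. Avoiding a function call on either of the two date column values also lets the query use any index available on that column.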
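
To check the consecutive-id query on a small scale, the six sample rows from the question can be loaded into a scratch table. This is only a sketch: the column types are assumptions (the question does not show the schema), and the table is called thetable to match the queries in this answer.

```sql
-- Hypothetical schema: the column types are guesses, not taken from the question.
CREATE TABLE thetable (
    id     INT PRIMARY KEY,
    `date` DATETIME NOT NULL,
    value  DECIMAL(6,3) NOT NULL
);

-- The six sample rows shown in the question.
INSERT INTO thetable (id, `date`, value) VALUES
    (1820183, '2011-01-19 18:21:02', 2.001),
    (1820184, '2011-01-20 17:45:05', 1.785),
    (1820185, '2011-01-20 17:46:10', 1.755),
    (1820186, '2011-01-21 21:54:00', 2.313),
    (1820187, '2011-01-21 21:55:11', 2.312),
    (1820188, '2011-01-22 17:48:56', 1.287);

-- Same join as above, trimmed to the columns needed to inspect the pairs.
SELECT a.id AS first_id, b.id AS second_id,
       TIMESTAMPDIFF(SECOND, a.`date`, b.`date`) AS delta_t
  FROM thetable AS a
  JOIN thetable AS b  ON b.id = a.id + 1
                     AND b.`date` <= a.`date` + INTERVAL 2 MINUTE
 ORDER BY a.id;

-- Pairs returned for these six rows:
--   first_id 1820184, second_id 1820185, delta_t 65
--   first_id 1820186, second_id 1820187, delta_t 71
```

Those two pairs cover exactly the four rows the question asks for.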

Creating a covering index on (id, date, value) will help the performance of this query.
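
As a rough sketch, the index could be created as follows. The index name is made up for illustration, and the table name thetable follows the queries in this answer (the question calls it ACQUISITION). Prefixing the query with EXPLAIN is one way to see whether MySQL actually picks the index up.

```sql
-- Hypothetical index name; adjust the table name to your own schema.
CREATE INDEX idx_id_date_value
    ON thetable (id, `date`, value);

-- Inspect the execution plan of the consecutive-id query.
EXPLAIN
SELECT a.`date` first_date, a.id first_id, a.value first_value,
       b.id second_id, b.value second_value,
       TIMESTAMPDIFF(SECOND, a.`date`, b.`date`) delta_t
  FROM thetable AS a
  JOIN thetable AS b  ON b.id = a.id + 1
                     AND b.`date` <= a.`date` + INTERVAL 2 MINUTE;
```

If id is already the primary key of an InnoDB table, the lookup on b.id is served by the primary key anyway, so the gain from the extra index may be small; measuring on the real data is the only reliable check.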

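If the consecutive-row assumption doesn't hold for this dataset, you can try this instead, which compares each row with the next ten rows. It will be slower. (http://sqlfiddle.com/#!9/28738/6/0)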

SELECT a.date first_date, a.id first_id, a.value first_value,
       b.id second_id, b.value second_value,
       TIMESTAMPDIFF(SECOND, a.date, b.date) delta_t
  FROM thetable AS a
  JOIN thetable AS b  ON b.id <= a.id + 10
                     AND b.id >  a.id 
                     AND b.date <= a.date + INTERVAL 2 MINUTE

If the id values are completely useless as a way of ordering the rows, then you'll need this. And it will be slow. (http://sqlfiddle.com/#!9/28738/5/0)

SELECT a.date first_date, a.id first_id, a.value first_value,
       b.id second_id, b.value second_value,
       TIMESTAMPDIFF(SECOND, a.date, b.date) delta_t
  FROM thetable AS a
  JOIN thetable AS b  ON b.date <= a.date + INTERVAL 2 MINUTE
                     AND b.date >  a.date
                     AND b.id <> a.id
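
A different technique, not part of the queries above, is to skip the self-join and use window functions. This sketch assumes MySQL 8.0 or later (where LAG() and LEAD() are available) and the same thetable name. It relies on the observation that a row lies within two minutes of some other row exactly when it lies within two minutes of its immediate neighbour in date order, and it returns the matching rows themselves rather than pairs.

```sql
-- Assumes MySQL 8.0+; LAG/LEAD do not exist in earlier versions.
SELECT id, `date`, value
  FROM (
        SELECT id, `date`, value,
               LAG(`date`)  OVER (ORDER BY `date`) AS prev_date,  -- previous row in time order
               LEAD(`date`) OVER (ORDER BY `date`) AS next_date   -- next row in time order
          FROM thetable
       ) AS t
 WHERE `date` <= prev_date + INTERVAL 2 MINUTE   -- close to the row before it
    OR next_date <= `date` + INTERVAL 2 MINUTE   -- or close to the row after it
 ORDER BY `date` DESC;
```

On the six sample rows from the question this yields ids 1820187, 1820186, 1820185 and 1820184. An index on the date column may help the sort, but that has not been measured here.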