Query to find duplicate timestamps MySQL
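I wrote the following query to find duplicate timestamps within a date range, with the aim of deleting the duplicates that have the larger IDs. However, this SELECT never finishes.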
SELECT
    *
FROM
    data
WHERE
    id NOT IN (SELECT
            MIN(id)
        FROM
            data
        WHERE
            datapoint_name LIKE 'Temp%'
                AND timestamp BETWEEN '2012-07-31' AND '2012-08-03'
        GROUP BY timestamp , datapoint_name)
        AND datapoint_name LIKE 'Temp%'
        AND timestamp BETWEEN '2012-07-31' AND '2012-08-03';
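I find this strange because the individual components run very quickly and there are not that many rows. Specifically:
- The SELECT MIN(id) ... GROUP BY subquery returns 476 rows in 0.7 seconds
- The outer SELECT * without the id NOT IN() returns 490 rows in 0.001 seconds
In other words, there are 14 duplicates, but the NOT IN() operation seems to take an inordinate amount of time. In fact, I have never had the patience to see whether it completes. What can I do to speed this up? Am I doing something fundamentally wrong?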
The reason is probably that the subquery is being re-run for every row being compared. Try moving the subquery to the FROM clause and using a LEFT JOIN:
SELECT d.*
FROM data d LEFT JOIN
     (SELECT timestamp, datapoint_name, MIN(id) AS minid
      FROM data
      WHERE datapoint_name LIKE 'Temp%' AND
            timestamp BETWEEN '2012-07-31' AND '2012-08-03'
      GROUP BY timestamp, datapoint_name
     ) dd
     ON d.datapoint_name = dd.datapoint_name AND
        d.timestamp = dd.timestamp AND
        d.id = dd.minid
WHERE d.datapoint_name LIKE 'Temp%' AND
      d.timestamp BETWEEN '2012-07-31' AND '2012-08-03' AND
      -- no match means this row is not the MIN(id) of its group, i.e. a duplicate
      dd.minid IS NULL;
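Since the stated goal is to delete the duplicates with the larger IDs, the same derived table could also drive the delete. This is only a sketch, assuming MySQL's multi-table DELETE syntax and the table and column names from the question. It has not been run against the asker's data, so check the rows returned by the SELECT first, and work on a backup or inside a transaction if the storage engine supports one.

```sql
-- Sketch: delete every row in the date range whose id is larger than the
-- smallest id sharing its (timestamp, datapoint_name) pair.
DELETE d
FROM data d JOIN
     (SELECT timestamp, datapoint_name, MIN(id) AS minid
      FROM data
      WHERE datapoint_name LIKE 'Temp%' AND
            timestamp BETWEEN '2012-07-31' AND '2012-08-03'
      GROUP BY timestamp, datapoint_name
     ) dd
     ON d.datapoint_name = dd.datapoint_name AND
        d.timestamp = dd.timestamp AND
        d.id > dd.minid;
```

The inner join on d.id > dd.minid keeps the lowest-ID row of each group and matches only the later copies, so no outer WHERE clause is needed. The grouped derived table should be evaluated once rather than per row.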