ORDER BY makes query slow
I have two tables:
video (ID, TITLE, ..., UPLOADED_DATE)
join_video_category (ID (not used), ID_VIDEO, ID_CATEGORY)
Rows in video: 4,500,000 | Rows in join_video_category: 5,800,000
One video can have several categories.
I have a query that works perfectly and returns results in at most 20 ms:
SELECT * FROM video WHERE ID IN
(SELECT ID_VIDEO FROM join_video_category WHERE ID_CATEGORY=11)
LIMIT 1000;
This query fetches 1000 videos; the order doesn't matter.
However, when I want to get the 10 latest videos from a category, my query takes about 30-40 seconds:
SELECT * FROM video WHERE ID IN
(SELECT ID_VIDEO FROM join_video_category WHERE ID_CATEGORY=11)
ORDER BY UPLOADED_DATE DESC LIMIT 10;
I have indexes on ID_CATEGORY, ID_VIDEO, UPLOADED_DATE, and ID for the video and join_video_category tables.
I have also tested it with a JOIN in the query, with the same result.
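First, you are comparing two very different queries. The first returns whichever videos it happens to encounter, so it can stop as soon as it has 1000 rows. The second has to read all the videos in the category and then sort them before it can return the first 10.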
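One way to check this on your own data, assuming MySQL, is to run EXPLAIN on both queries and compare the plans. This is only a sketch: the exact output depends on the server version and the indexes actually present. A note such as "Using temporary; Using filesort" in the Extra column for the second query would suggest that the matching rows are collected and sorted before LIMIT is applied.

```sql
-- Fast query: rows can be returned as soon as they are found.
EXPLAIN
SELECT * FROM video WHERE ID IN
  (SELECT ID_VIDEO FROM join_video_category WHERE ID_CATEGORY = 11)
LIMIT 1000;

-- Slow query: the rows for the category must be ordered by
-- UPLOADED_DATE before the first 10 can be returned.
EXPLAIN
SELECT * FROM video WHERE ID IN
  (SELECT ID_VIDEO FROM join_video_category WHERE ID_CATEGORY = 11)
ORDER BY UPLOADED_DATE DESC LIMIT 10;
```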
Try rewriting it as a JOIN:
SELECT v.*
FROM video v JOIN
join_video_category vc
ON v.id = vc.id_video
WHERE vc.ID_CATEGORY = 11
ORDER BY v.UPLOADED_DATE DESC
LIMIT 10;
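This may or may not help. You have a lot of data, so you may well have many videos for a given category. If so, a where clause that restricts the query to more recent data could really help: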
SELECT v.*
FROM video v JOIN
join_video_category vc
ON v.id = vc.id_video
WHERE vc.ID_CATEGORY = 11 AND v.UPLOADED_DATE >= '2015-01-01'
ORDER BY v.UPLOADED_DATE DESC
LIMIT 10;
Finally, if that doesn't work, consider adding something like UPLOADED_DATE to join_video_category. Then the query would be:
select vc.ID_VIDEO
from join_video_category vc
where vc.ID_CATEGORY = 11
order by vc.UPLOADED_DATE desc
limit 10;
with an index on join_video_category(ID_CATEGORY, UPLOADED_DATE, ID_VIDEO).
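The denormalization described above could be set up roughly as follows. This is a sketch in MySQL syntax under several assumptions: the DATETIME type is a guess (use whatever type video.UPLOADED_DATE actually has), the index name idx_category_uploaded is made up for illustration, and the copied date has to be kept in sync by the application whenever a video's upload date changes.

```sql
-- Add a copy of the upload date to the junction table (type is assumed).
ALTER TABLE join_video_category
  ADD COLUMN UPLOADED_DATE DATETIME NULL;

-- Backfill the new column from the video table.
UPDATE join_video_category vc
JOIN video v ON v.ID = vc.ID_VIDEO
SET vc.UPLOADED_DATE = v.UPLOADED_DATE;

-- Index that matches the WHERE and ORDER BY of the query.
CREATE INDEX idx_category_uploaded
  ON join_video_category (ID_CATEGORY, UPLOADED_DATE, ID_VIDEO);

-- Pick the 10 newest ids from the index, then fetch the full rows.
SELECT v.*
FROM (SELECT ID_VIDEO
      FROM join_video_category
      WHERE ID_CATEGORY = 11
      ORDER BY UPLOADED_DATE DESC
      LIMIT 10) AS latest
JOIN video v ON v.ID = latest.ID_VIDEO
ORDER BY v.UPLOADED_DATE DESC;
```

With the index in place, the inner query should only need to read 10 index entries for the category, so the join back to video touches 10 rows instead of the whole category.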
Solution #1:
Replacing "in" with "exists" may improve performance; try the following query.
SELECT * FROM video WHERE exists
(SELECT * FROM join_video_category WHERE ID_CATEGORY=11 AND join_video_category.ID_VIDEO = video.ID)
ORDER BY UPLOADED_DATE DESC LIMIT 10;
Solution #2:
1) Create temp_table
CREATE TABLE temp_table AS SELECT * FROM join_video_category WHERE ID_CATEGORY=11;
2) Use the temp table in the query from Solution #1
SELECT * FROM video WHERE exists
(SELECT * FROM temp_table WHERE temp_table.ID_VIDEO = video.ID)
ORDER BY UPLOADED_DATE DESC LIMIT 10;
Good luck!!
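A variant of Solution #2, sketched in MySQL syntax, uses a session-scoped temporary table and gives it an index so the EXISTS lookup does not have to scan it. The table name tmp_category_11 is hypothetical, and whether this beats a plain JOIN depends on your data, so treat it as something to measure rather than a guaranteed win.

```sql
-- Temporary table: visible only to this session, dropped when it ends.
-- A plain INDEX is used rather than a PRIMARY KEY in case the same
-- video appears more than once in the category.
CREATE TEMPORARY TABLE tmp_category_11 (INDEX (ID_VIDEO)) AS
SELECT ID_VIDEO
FROM join_video_category
WHERE ID_CATEGORY = 11;

-- Same shape as the query in Solution #1, against the temporary table.
SELECT * FROM video
WHERE EXISTS
  (SELECT 1 FROM tmp_category_11 t WHERE t.ID_VIDEO = video.ID)
ORDER BY UPLOADED_DATE DESC LIMIT 10;

DROP TEMPORARY TABLE tmp_category_11;
```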
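If it is 1:many, don't use an extra table between videos and categories. However, your row counts suggest it is many:many.
If it is 1:many, simply have a category_id in the video table, then simplify all the queries.
If it is many:many, be sure to use this pattern for the junction table: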
CREATE TABLE map_video_category (
video_id ...,
category_id ...,
PRIMARY KEY(video_id, category_id), -- both ids, one direction
INDEX (category_id, video_id) -- both ids, the other direction
) ENGINE=InnoDB; -- significantly better than MyISAM on INDEX handling here
The ID column you mentioned is a waste. The composite keys work for all situations, and will improve performance in most of them.
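Applied to the existing join_video_category table instead of a new one, the change might look like the sketch below. It assumes that ID is currently the primary key, that ID_VIDEO and ID_CATEGORY are NOT NULL, and that no (ID_VIDEO, ID_CATEGORY) pair occurs twice. The index name idx_category_video is made up for illustration. On a table of this size the ALTER can take a while, so try it on a copy first.

```sql
-- Check the uniqueness assumption first; this should return no rows.
SELECT ID_VIDEO, ID_CATEGORY, COUNT(*) AS copies
FROM join_video_category
GROUP BY ID_VIDEO, ID_CATEGORY
HAVING COUNT(*) > 1;

-- Drop the unused surrogate key and index both directions.
ALTER TABLE join_video_category
  DROP COLUMN ID,
  ADD PRIMARY KEY (ID_VIDEO, ID_CATEGORY),
  ADD INDEX idx_category_video (ID_CATEGORY, ID_VIDEO);
```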
Do not use IN ( SELECT ... ); the optimizer does a poor job of optimizing it. Change to JOIN, LEFT JOIN, EXISTS, or some other construct.