JOINs not using indexes in a predictable manner
Suppose I have three tables.
CREATE TABLE movies (
id INT AUTO_INCREMENT,
name VARCHAR(255),
PRIMARY KEY (id)
);
CREATE TABLE movies_actors (
id INT AUTO_INCREMENT,
movie_id INT,
actor_id INT,
current_salary_id INT,
PRIMARY KEY (id),
KEY movie_id (movie_id),
KEY actor_id (actor_id),
KEY current_salary_id (current_salary_id)
);
CREATE TABLE movies_actors_salaries (
id INT AUTO_INCREMENT,
actor_id INT,
compensation_type ENUM('salary','hourly','commission','lumpsum'),
amount DECIMAL(9,2),
date_agreed_upon DATETIME,
PRIMARY KEY (id),
KEY actor_id (actor_id)
);
I'm trying to join these tables to run some queries, but the indexes are only used some of the time, and I can't work out why.
SELECT COUNT(1)
FROM movies m
JOIN movies_actors ma ON m.id = ma.movie_id
JOIN movies_actors_salaries mas ON ma.current_salary_id = mas.id;
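If I EXPLAIN this, the Extra column for the ma table does not show "Using index". It doesn't matter whether I use LEFT JOIN movies_actors_salaries or JOIN movies_actors_salaries; the index just isn't used. I don't understand this, because m.id is the PRIMARY KEY of the movies table and ma.movie_id is a KEY.
I also tried another query: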
SELECT COUNT(1)
FROM movies m
JOIN movies_actors ma ON m.id = ma.movie_id
JOIN movies_actors_salaries mas ON ma.id = mas.actor_id;
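If I EXPLAIN this, the Extra column for the ma table does not show "Using index", but if I use LEFT JOIN movies_actors_salaries instead of JOIN, the index does get used. Again, I don't understand: why does the index used on the movies_actors table depend on how I join the movies_actors_salaries table?
Honestly, I don't understand this at all. It seems to me that the Extra column in all four EXPLAINs (that is, the two queries above, each with JOIN movies_actors_salaries and with LEFT JOIN movies_actors_salaries) should say "Using index".
I'm using Percona MySQL 5.5.35-33.0. Any ideas?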
Something more worrying than the rows=1 and Using where shown for ma here:
mysql> explain SELECT COUNT(m.id) FROM movies m JOIN movies_actors ma ON m.id = ma.movie_id JOIN movies_actors_salaries mas ON ma.current_salary_id = mas.id;
+----+-------------+-------+--------+----------------------------+---------+---------+-----------------------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+----------------------------+---------+---------+-----------------------------------+------+-------------+
| 1 | SIMPLE | ma | ALL | movie_id,current_salary_id | NULL | NULL | NULL | 1 | Using where |
| 1 | SIMPLE | mas | eq_ref | PRIMARY | PRIMARY | 4 | so_gibberish.ma.current_salary_id | 1 | Using index |
| 1 | SIMPLE | m | eq_ref | PRIMARY | PRIMARY | 4 | so_gibberish.ma.movie_id | 1 | Using index |
+----+-------------+-------+--------+----------------------------+---------+---------+-----------------------------------+------+-------------+
3 rows in set (0.05 sec)
is what happens when the last key is dropped, as here:
-- drop table movies_actors;
CREATE TABLE movies_actors (
id INT AUTO_INCREMENT,
movie_id INT,
actor_id INT,
current_salary_id INT,
PRIMARY KEY (id),
KEY movie_id (movie_id),
KEY actor_id (actor_id)
-- KEY current_salary_id (current_salary_id)
);
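That results in a new, scarier EXPLAIN, with rows=1024 and Using where; Using join buffer (Block Nested Loop) (or, in other cases, Using filesort or Using temporary). This is the output after the schema change above and after adding some filler rows: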
+----+-------------+-------+--------+---------------+----------+---------+--------------------------+------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+----------+---------+--------------------------+------+----------------------------------------------------+
| 1 | SIMPLE | mas | index | PRIMARY | actor_id | 5 | NULL | 1 | Using index |
| 1 | SIMPLE | ma | ALL | movie_id | NULL | NULL | NULL | 1024 | Using where; Using join buffer (Block Nested Loop) |
| 1 | SIMPLE | m | eq_ref | PRIMARY | PRIMARY | 4 | so_gibberish.ma.movie_id | 1 | Using index |
+----+-------------+-------+--------+---------------+----------+---------+--------------------------+------+----------------------------------------------------+
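The experiment above can be repeated without recreating the table. This is a minimal sketch, assuming the three tables from the question already exist; the filler values and the doubling approach are arbitrary choices for illustration, not the statements used to produce the output above.

```sql
-- Remove the key the optimizer listed in possible_keys for the ma -> mas join.
ALTER TABLE movies_actors DROP INDEX current_salary_id;

-- Seed one placeholder row, then double the table by inserting it into itself.
-- Running the INSERT ... SELECT ten times in total turns 1 row into 1024 rows.
INSERT INTO movies_actors (movie_id, actor_id, current_salary_id)
VALUES (1, 1, 1);

INSERT INTO movies_actors (movie_id, actor_id, current_salary_id)
SELECT movie_id, actor_id, current_salary_id
FROM movies_actors;

-- Re-check the plan for the first query from the question.
EXPLAIN
SELECT COUNT(m.id)
FROM movies m
JOIN movies_actors ma ON m.id = ma.movie_id
JOIN movies_actors_salaries mas ON ma.current_salary_id = mas.id;

-- Put the key back when done.
ALTER TABLE movies_actors ADD INDEX current_salary_id (current_salary_id);
```

The exact rows estimate and Extra text will vary with the server version and the data that is actually loaded.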
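Takeaways
EXPLAIN is cryptic, as if you didn't already know, but the fact that your row counts are low should be comforting compared with the alternatives just mentioned (that is, 1k rows, filesort, temporary tables).
EXPLAIN also lies. It is a whimsical fantasy world of estimates, rendered in a split second from a handful of rows; when the EXPLAIN keyword is removed and the query really runs, the server changes course according to the actual data.
I could have 1 row in movies_actors_salaries that matches your join, and Using index would suggest that mas uses the index, but I assure you it won't, because of this excerpt from the manual page: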
Indexes are less important for queries on small tables, or big tables
where report queries process most or all of the rows. When a query
needs to access most of the rows, reading sequentially is faster than
working through an index. Sequential reads minimize disk seeks, even
if not all the rows are needed for the query.
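One way to watch the optimizer make that choice is to compare its own plan with a plan that carries an index hint. This is a sketch, assuming the schema from the question; FORCE INDEX is a standard MySQL index hint, and whether the hinted plan is actually faster depends entirely on the data.

```sql
-- Plan chosen freely by the optimizer.
EXPLAIN
SELECT COUNT(m.id)
FROM movies m
JOIN movies_actors ma ON m.id = ma.movie_id
JOIN movies_actors_salaries mas ON ma.current_salary_id = mas.id;

-- Same query, but a table scan of ma is treated as very expensive,
-- so the movie_id key is used if it can be used at all.
EXPLAIN
SELECT COUNT(m.id)
FROM movies m
JOIN movies_actors ma FORCE INDEX (movie_id) ON m.id = ma.movie_id
JOIN movies_actors_salaries mas ON ma.current_salary_id = mas.id;
```

On a nearly empty table the two plans cost about the same, which is the point of the excerpt: the scan is not a sign that anything is broken.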
So you're in good shape. Keep an eye on the row counts in EXPLAIN, and watch for Using filesort and Using temporary.
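As for why ma does not show "Using index" in the first query: that flag in the Extra column means the needed columns were read from the index alone, without touching the table rows. The first query needs both movie_id and current_salary_id from ma, and no single index in the question's schema contains both. The following is a sketch of one possible change, assuming a new composite index (the name movie_salary is made up for illustration); the optimizer may still prefer a scan when the table is small.

```sql
-- Hypothetical composite index covering both ma columns the first query reads.
ALTER TABLE movies_actors
  ADD INDEX movie_salary (movie_id, current_salary_id);

-- Refresh index statistics so the optimizer sees the new key.
ANALYZE TABLE movies_actors;

-- Check whether Extra for ma now reports "Using index".
EXPLAIN
SELECT COUNT(1)
FROM movies m
JOIN movies_actors ma ON m.id = ma.movie_id
JOIN movies_actors_salaries mas ON ma.current_salary_id = mas.id;
```

If the tables are InnoDB (the question does not say), every secondary index also stores the primary key. That would explain why the second query, which reads only ma.id and ma.movie_id, can be answered from the movie_id index alone when the join order allows it.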