MySQL: why is a query using a VIEW less efficient compared to a query directly using the view's underlying JOIN?
I have three tables, bug, bugrule, and bugtrace, related as follows:
bug 1--------N bugrule
id = bugid
bugrule 0---------N bugtrace
id = ruleid
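Because I am almost always interested in the relation bug <---> bugtrace, I created an appropriate VIEW that is used as part of several queries. Interestingly, queries using this VIEW perform significantly worse than equivalent queries that use the underlying JOIN explicitly.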
VIEW definition:
CREATE VIEW bugtracev AS
SELECT t.*, r.bugid
FROM bugtrace AS t
LEFT JOIN bugrule AS r ON t.ruleid=r.id
WHERE r.version IS NULL
Execution plan for the query using the VIEW (poor performance):
mysql> explain
SELECT c.id,state,
(SELECT COUNT(DISTINCT(t.id)) FROM bugtracev AS t
WHERE t.bugid=c.id)
FROM bug AS c
WHERE c.version IS NULL
AND c.id<10;
+----+--------------------+-------+-------+---------------+--------+---------+-----------------+---------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+-------+---------------+--------+---------+-----------------+---------+-----------------------+
| 1 | PRIMARY | c | range | id_2,id | id_2 | 8 | NULL | 3 | Using index condition |
| 2 | DEPENDENT SUBQUERY | t | index | NULL | ruleid | 9 | NULL | 1426004 | Using index |
| 2 | DEPENDENT SUBQUERY | r | ref | id_2,id | id_2 | 8 | bugapp.t.ruleid | 1 | Using where |
+----+--------------------+-------+-------+---------------+--------+---------+-----------------+---------+-----------------------+
3 rows in set (0.00 sec)
Execution plan for the query using the underlying JOIN directly (good performance):
mysql> explain
SELECT c.id,state,
(SELECT COUNT(DISTINCT(t.id))
FROM bugtrace AS t
LEFT JOIN bugrule AS r ON t.ruleid=r.id
WHERE r.version IS NULL
AND r.bugid=c.id)
FROM bug AS c
WHERE c.version IS NULL
AND c.id<10;
+----+--------------------+-------+-------+---------------+--------+---------+-------------+--------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+-------+---------------+--------+---------+-------------+--------+-----------------------+
| 1 | PRIMARY | c | range | id_2,id | id_2 | 8 | NULL | 3 | Using index condition |
| 2 | DEPENDENT SUBQUERY | r | ref | id_2,id,bugid | bugid | 8 | bugapp.c.id | 1 | Using where |
| 2 | DEPENDENT SUBQUERY | t | ref | ruleid | ruleid | 9 | bugapp.r.id | 713002 | Using index |
+----+--------------------+-------+-------+---------------+--------+---------+-------------+--------+-----------------------+
3 rows in set (0.00 sec)
The CREATE TABLE statements (reduced by irrelevant columns) are:
mysql> show create table bug;
CREATE TABLE `bug` (
`id` bigint(20) NOT NULL,
`version` int(11) DEFAULT NULL,
`state` varchar(16) DEFAULT NULL,
UNIQUE KEY `id_2` (`id`,`version`),
KEY `id` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
mysql> show create table bugrule;
CREATE TABLE `bugrule` (
`id` bigint(20) NOT NULL,
`version` int(11) DEFAULT NULL,
`bugid` bigint(20) NOT NULL,
UNIQUE KEY `id_2` (`id`,`version`),
KEY `id` (`id`),
KEY `bugid` (`bugid`),
CONSTRAINT `bugrule_ibfk_1` FOREIGN KEY (`bugid`) REFERENCES `bug` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
mysql> show create table bugtrace;
CREATE TABLE `bugtrace` (
`id` bigint(20) NOT NULL,
`ruleid` bigint(20) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `ruleid` (`ruleid`),
CONSTRAINT `bugtrace_ibfk_1` FOREIGN KEY (`ruleid`) REFERENCES `bugrule` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
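You ask why, about the query optimization of a couple of complex queries that use COUNT(DISTINCT val) and dependent subqueries. It's hard to know for sure why.
You will probably fix most of your performance problem by getting rid of the dependent subquery, though. Try something like this: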
SELECT c.id,state, cnt.cnt
FROM bug AS c
LEFT JOIN (
SELECT bugid, COUNT(DISTINCT id) cnt
FROM bugtracev
GROUP BY bugid
) cnt ON c.id = cnt.bugid
WHERE c.version IS NULL
AND c.id<10;
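Why does this help? To satisfy the query, the optimizer can choose to run the GROUP BY subquery just once, rather than once per row of bug. And you can run EXPLAIN on the GROUP BY subquery by itself to understand its performance.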
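A sketch of both suggestions, assuming the bugtracev view and the tables shown in the question. One detail to check against your requirements: the correlated COUNT returns 0 for a bug with no traces, whereas the LEFT JOIN form returns NULL, so COALESCE is used here to keep the two results comparable. The alias names counts and trace_count are illustrative, not from the original query.

```sql
-- Inspect the grouped subquery on its own.
EXPLAIN
SELECT bugid, COUNT(DISTINCT id) AS trace_count
  FROM bugtracev
 GROUP BY bugid;

-- Same rewrite as above, with COALESCE so that bugs without traces
-- report 0 (as the correlated COUNT does) instead of NULL.
SELECT c.id, c.state, COALESCE(counts.trace_count, 0) AS trace_count
  FROM bug AS c
  LEFT JOIN (
        SELECT bugid, COUNT(DISTINCT id) AS trace_count
          FROM bugtracev
         GROUP BY bugid
       ) AS counts ON c.id = counts.bugid
 WHERE c.version IS NULL
   AND c.id < 10;
```

As far as I know, MySQL 5.6 and 5.7 materialize a grouped derived table in full, for every bugid and not only those with c.id < 10, so compare timings on real data before adopting this form.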
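You may also get a performance boost by creating a compound index on bugrule that matches the query in your view (which filters on version, joins on id, and selects bugid). Try this one.
CREATE INDEX bugrule_v ON bugrule (version, id, bugid)
and, as an alternative (drop the first index before creating this one, since both use the same name), try switching the last two columns like so
CREATE INDEX bugrule_v ON bugrule (version, bugid, id)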
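These are called covering indexes because they contain all the columns needed to satisfy your query. version appears first because that helps optimize the WHERE version IS NULL in your view definition, which makes the lookup faster.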
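To check whether such an index is actually used as a covering index, you can EXPLAIN a query that touches only the indexed columns. This is a minimal sketch; the column list assumes the bugrule definition shown in the question.

```sql
-- Reads only columns that a (version, id, bugid) index contains.
EXPLAIN
SELECT id, bugid
  FROM bugrule
 WHERE version IS NULL;
-- When MySQL answers the query from the index alone, the Extra column
-- of the EXPLAIN output contains "Using index".
```

If the key column shows a different index, or Extra lacks "Using index", the optimizer chose another access path and the new index is not helping this query.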
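Pro tip: Avoid SELECT * in views and queries, especially when you have performance problems. Instead, list the columns you actually need. The * may force the query optimizer to avoid a covering index, even when the index would help.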
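For example, the view could list its columns explicitly. This sketch assumes the queries that use the view need only id, ruleid, and bugid; the real bugtrace table has further columns that were omitted from the question, so adjust the list to what you actually select.

```sql
-- Same join and filter as the original view, without t.*
CREATE OR REPLACE VIEW bugtracev AS
SELECT t.id, t.ruleid, r.bugid
  FROM bugtrace AS t
  LEFT JOIN bugrule AS r ON t.ruleid = r.id
 WHERE r.version IS NULL;
```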
If you are on MySQL 5.6 (or older), at least try MySQL 5.7. According to What's New in MySQL 5.7?:
We have to a large extent unified the handling of derived tables and views. Until now, subqueries in the FROM clause (derived tables) were unconditionally materialized, while views created from the same query expressions were sometimes materialized and sometimes merged into the outer query. This behavior, beside being inconsistent, can lead to a serious performance penalty.
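Before and after an upgrade, a few statements show what the server is doing with the view. This is only a sketch for inspection; derived_merge is an optimizer_switch flag that exists from MySQL 5.7 on, so it will not appear on 5.6.

```sql
-- Which server version is this?
SELECT VERSION();

-- The stored definition includes the view's ALGORITHM
-- (UNDEFINED, MERGE, or TEMPTABLE).
SHOW CREATE VIEW bugtracev;

-- MySQL 5.7 and later: look for derived_merge=on in the result.
SELECT @@optimizer_switch;
```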