Is it possible to optimize these queries any further?
The schema of visits_table looks like this:
+---------------------------+----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------------------+----------------------+------+-----+---------+----------------+
| idvisit | int(10) unsigned | NO | PRI | NULL | auto_increment |
| idsite | int(10) unsigned | NO | MUL | NULL | |
| idvisitor | binary(8) | NO | | NULL | |
| visit_time | datetime | NO | | NULL | |
| user_id | varchar(200) | YES | | NULL | |
| config_cookie | tinyint(1) | NO | | NULL | |
| custom_var_k1 | varchar(200) | YES | | NULL | |
| custom_var_v1 | varchar(200) | YES | | NULL | |
+---------------------------+----------------------+------+-----+---------+----------------+
Indexes:
+----------------------+------------+------------------------------+--------------+------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------------------+------------+------------------------------+--------------+------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| visits_table | 0 | PRIMARY | 1 | idvisit | A | 1502 | NULL | NULL | | BTREE | | |
| visits_table | 1 | index_idsite_datetime | 1 | idsite | A | 5 | NULL | NULL | | BTREE | | |
| visits_table | 1 | index_idsite_datetime | 2 | visit_time | A | 1502 | NULL | NULL | | BTREE | | |
| visits_table | 1 | index_idsite_idvisitor | 1 | idsite | A | 1 | NULL | NULL | | BTREE | | |
| visits_table | 1 | index_idsite_idvisitor | 2 | idvisitor | A | 500 | NULL | NULL | | BTREE | | |
+----------------------+------------+------------------------------+--------------+------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
I have prepared two queries:
SELECT
COUNT(`idvisit`) AS `visits_count`,
DATE(`visit_time`) AS `date`
FROM (
SELECT *
FROM
`visits_table`
WHERE
`idsite` = 2
AND `visit_time` >= '2015-04-01 00:00:00'
AND `visit_time` <= '2015-04-30 23:59:59'
) AS `visits`
WHERE 1
GROUP BY
DATE(`visit_time`);
+----+-------------+----------------------+------+----------------------------------------------+------+---------+------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------------+------+----------------------------------------------+------+---------+------+------+---------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 1469 | Using temporary; Using filesort |
| 2 | DERIVED | visits_table | ALL | index_idsite_datetime,index_idsite_idvisitor | NULL | NULL | NULL | 1502 | Using where |
+----+-------------+----------------------+------+----------------------------------------------+------+---------+------+------+---------------------------------+
In MySQL 5.6, row 2 of this plan instead shows type = ref, key = index_idsite_datetime, key_len = 4, ref = const, Extra = Using index.
SELECT
COUNT(`idvisit`) AS `visits_count`,
DATE(`visit_time`) AS `date`
FROM
`visits_table`
WHERE
`idsite` = 2
AND `visit_time` >= '2015-04-01 00:00:00'
AND `visit_time` <= '2015-04-30 23:59:59'
GROUP BY
DATE(`visit_time`);
+----+-------------+----------------------+-------+----------------------------------------------+-----------------------+---------+------+------+-----------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------------+-------+----------------------------------------------+-----------------------+---------+------+------+-----------------------------------------------------------+
| 1 | SIMPLE | visits_table | range | index_idsite_datetime,index_idsite_idvisitor | index_idsite_datetime | 12 | NULL | 1468 | Using where; Using index; Using temporary; Using filesort |
+----+-------------+----------------------+-------+----------------------------------------------+-----------------------+---------+------+------+-----------------------------------------------------------+
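My table has 86 million rows, and executing these two queries takes about 2 hours. Is there anything I can do to speed them up?
I would suggest writing the query as: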
SELECT COUNT(*) AS `visits_count`,
DATE(`visit_time`) AS `date`
FROM `visits_table`
WHERE `idsite` = 2 AND
`visit_time` >= '2015-04-01' AND
`visit_time` < '2015-05-01'
GROUP BY DATE(`visit_time`);
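This might save a little time, because the index is now a covering index for the query.
I think one way to improve the query is to get rid of the group by. Try a query like this: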
select dte,
(select count(*)
from visits_table
where idsite = 2 and
visit_time >= dates.dte and visit_time < dates.dte + interval 1 day
) as visits_count
from (select date('2015-04-01') as dte union all
select date('2015-04-02') as dte
-- add one "union all select date(...)" row per remaining day in the range
) dates;
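MySQL does a much better job of using indexes for correlated subqueries than for aggregations. The downside of this approach is that the time increases linearly with the number of days in the result set.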
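Writing out thirty "union all" rows by hand is tedious, so here is a rough sketch of the same correlated-subquery idea with the list of days generated from two small digit tables. This is only an illustration, not something tested against the table in the question: the aliases t, u and n and the column name visits_count are made up for this example, and it assumes the April 2015 range and idsite = 2 from the question. It avoids recursive CTEs so that it should also work on MySQL 5.6.

```sql
-- One row per day of April 2015, each with its own indexed range count.
select dates.dte,
       (select count(*)
        from visits_table
        where idsite = 2 and
              visit_time >= dates.dte and
              visit_time < dates.dte + interval 1 day
       ) as visits_count
from (-- t.n * 10 + u.n enumerates the offsets 0..29 from the first day
      select date('2015-04-01') + interval (t.n * 10 + u.n) day as dte
      from (select 0 as n union all select 1 union all select 2) t
      cross join
           (select 0 as n union all select 1 union all select 2 union all
            select 3 union all select 4 union all select 5 union all
            select 6 union all select 7 union all select 8 union all
            select 9) u
      where t.n * 10 + u.n < 30
     ) dates
order by dates.dte;
```

Each generated day becomes a separate range lookup on (idsite, visit_time), which index_idsite_datetime should be able to serve. Unlike the group by version, days with no visits still appear in the output, with a count of 0.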