是否可以再优化这些查询?

Is it possible to optimize these queries any more?

架构是这样的visits_table:

+---------------------------+----------------------+------+-----+---------+----------------+
| Field                     | Type                 | Null | Key | Default | Extra          |
+---------------------------+----------------------+------+-----+---------+----------------+
| idvisit                   | int(10) unsigned     | NO   | PRI | NULL    | auto_increment |
| idsite                    | int(10) unsigned     | NO   | MUL | NULL    |                |
| idvisitor                 | binary(8)            | NO   |     | NULL    |                |
| visit_time                | datetime             | NO   |     | NULL    |                |
| user_id                   | varchar(200)         | YES  |     | NULL    |                |
| config_cookie             | tinyint(1)           | NO   |     | NULL    |                |
| custom_var_k1             | varchar(200)         | YES  |     | NULL    |                |
| custom_var_v1             | varchar(200)         | YES  |     | NULL    |                |
+---------------------------+----------------------+------+-----+---------+----------------+

索引:

+----------------------+------------+------------------------------+--------------+------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table                | Non_unique | Key_name                     | Seq_in_index | Column_name            | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------------------+------------+------------------------------+--------------+------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| visits_table         |          0 | PRIMARY                      |            1 | idvisit                | A         |        1502 | NULL     | NULL   |      | BTREE      |         |               |
| visits_table         |          1 | index_idsite_datetime        |            1 | idsite                 | A         |           5 | NULL     | NULL   |      | BTREE      |         |               |
| visits_table         |          1 | index_idsite_datetime        |            2 | visit_time             | A         |        1502 | NULL     | NULL   |      | BTREE      |         |               |
| visits_table         |          1 | index_idsite_idvisitor       |            1 | idsite                 | A         |           1 | NULL     | NULL   |      | BTREE      |         |               |
| visits_table         |          1 | index_idsite_idvisitor       |            2 | idvisitor              | A         |         500 | NULL     | NULL   |      | BTREE      |         |               |
+----------------------+------------+------------------------------+--------------+------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

并且我准备了两个查询:

SELECT 
    COUNT(`idvisit`) AS `visits_count`, 
    DATE(`visit_time`) AS `date` 
FROM (
    SELECT * 
    FROM 
        `visits_table` 
    WHERE 
        `idsite` = 2 
        AND `visit_time` >= '2015-04-01 00:00:00' 
        AND `visit_time` <= '2015-04-30 23:59:59'
) AS `visits`
WHERE 1 
GROUP BY 
    DATE(`visit_time`); 



+----+-------------+----------------------+------+----------------------------------------------+------+---------+------+------+---------------------------------+
| id | select_type | table                | type | possible_keys                                | key  | key_len | ref  | rows | Extra                           |
+----+-------------+----------------------+------+----------------------------------------------+------+---------+------+------+---------------------------------+
|  1 | PRIMARY     | <derived2>           | ALL  | NULL                                         | NULL | NULL    | NULL | 1469 | Using temporary; Using filesort |
|  2 | DERIVED     | visits_table         | ALL  | index_idsite_datetime,index_idsite_idvisitor | NULL | NULL    | NULL | 1502 | Using where                     |
+----+-------------+----------------------+------+----------------------------------------------+------+---------+------+------+---------------------------------+

in MySQL 5.6 in row 2 type = ref, key = index_idsite_datetime, key_len = 4, ref = const, Extra = Using index

    SELECT 
        COUNT(`idvisit`) AS `visits_count`, 
        DATE(`visit_time`) AS `date` 
    FROM 
        `visits_table` 
    WHERE 
        `idsite` = 2 
        AND `visit_time` >= '2015-04-01 00:00:00' 
        AND `visit_time` <= '2015-04-30 23:59:59'
    GROUP BY 
        DATE(`visit_time`);

+----+-------------+----------------------+-------+----------------------------------------------+-----------------------+---------+------+------+-----------------------------------------------------------+
| id | select_type | table                | type  | possible_keys                                | key                   | key_len | ref  | rows | Extra                                                     |
+----+-------------+----------------------+-------+----------------------------------------------+-----------------------+---------+------+------+-----------------------------------------------------------+
|  1 | SIMPLE      | visits_table         | range | index_idsite_datetime,index_idsite_idvisitor | index_idsite_datetime | 12      | NULL | 1468 | Using where; Using index; Using temporary; Using filesort |
+----+-------------+----------------------+-------+----------------------------------------------+-----------------------+---------+------+------+-----------------------------------------------------------+

我 table 有 8600 万行,执行这两个查询大约需要 2 小时。我可以做些什么来加快这些查询的速度吗?

我建议将查询写成:

SELECT COUNT(*) AS `visits_count`, 
       DATE(`visit_time`) AS `date` 
FROM  `visits_table` 
WHERE  `idsite` = 2 AND
       `visit_time` >= '2015-04-01' AND
       `visit_time` < '2015-05-01'
GROUP BY DATE(`visit_time`);

这可能会节省一点时间,因为索引现在是覆盖索引。

我认为改进查询的一种方法是去掉 group by。试试这样的查询:

select dte,
       (select count(*)
        from visits_table
        where idsite = 2 and
              visit_time >= dates.dte AND visit_time < dates.dte + interval 1 day
from (select date('2015-04-01') as dte union all
      select date('2015-04-02') as dte
     ) dates;

MySQL 将索引用于相关子查询比将索引用于聚合要好得多。这种方法的缺点是时间会随着结果集中的天数线性增加。