MySQL large table performance optimization

I am trying to solve a performance problem with this table:
+--------------+------------------+------+-----+---------+----------------+
| Field        | Type             | Null | Key | Default | Extra          |
+--------------+------------------+------+-----+---------+----------------+
| id           | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| direction_id | int(10) unsigned | NO   | MUL | NULL    |                |
| created_at   | datetime         | NO   |     | NULL    |                |
| rate         | decimal(16,6)    | NO   |     | NULL    |                |
+--------------+------------------+------+-----+---------+----------------+

It contains about 100 million rows.

Only one query selects data from this table:

SELECT AVG(rate) AS rate, created_at 
FROM statistics 
WHERE direction_id = ? 
AND created_at BETWEEN ? AND ? 
GROUP BY created_at

direction_id is a foreign key, but its selectivity is poor:

+----+-------------+------------+------------+------+---------------------------------+---------------------------------+---------+-------+-------+----------+---------------------------------------------------------------------+
| id | select_type | table      | partitions | type | possible_keys                   | key                             | key_len | ref   | rows  | filtered | Extra                                                               |
+----+-------------+------------+------------+------+---------------------------------+---------------------------------+---------+-------+-------+----------+---------------------------------------------------------------------+
|  1 | SIMPLE      | statistics | NULL       | ref  | statistics_direction_id_foreign | statistics_direction_id_foreign | 4       | const | 26254 |    11.11 | Using index condition; Using where; Using temporary; Using filesort |
+----+-------------+------------+------------+------+---------------------------------+---------------------------------+---------+-------+-------+----------+---------------------------------------------------------------------+

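So I am looking for a way to fix this and would appreciate advice. Would partitioning by HASH(direction_id) help? If so, what is the best way to do it?

Or perhaps there is another way to fix it.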

First, let's fix your query so that it is a valid aggregation query. Presumably you want the daily average of rate, so:

SELECT AVG(rate) AS rate, DATE(created_at) as created_day
FROM statistics 
WHERE direction_id = ? AND created_at BETWEEN ? AND ? 
GROUP BY DATE(created_at)

Then, I would suggest creating the following index:

create index idx_statistics on statistics (direction_id, created_at, rate);
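One detail worth checking with this index: if the BETWEEN bounds are plain dates, the upper bound is compared as midnight at the start of that day, so rows later on the last day are left out. A minimal sketch of a half-open range that keeps whole-day semantics while still filtering on the bare created_at column (the direction_id value and the dates are placeholder assumptions, not values from the question):

```sql
-- Placeholder values; in practice these would be bound parameters.
SELECT AVG(rate) AS rate, DATE(created_at) AS created_day
FROM statistics
WHERE direction_id = 42
  AND created_at >= '2024-01-01'   -- inclusive: start of the first day
  AND created_at <  '2024-02-01'   -- exclusive: midnight after the last day
GROUP BY DATE(created_at);
```

Because the range is expressed on created_at itself rather than on DATE(created_at), the second column of the index can still be used for the range condition.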

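In recent versions of MySQL (8.0.13 and later, which support functional key parts), we could also consider an index on DATE(created_at). If the following WHERE clause is acceptable to you: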

WHERE direction_id = ? AND DATE(created_at) BETWEEN ? AND ? 

then the following index would come in handy:

create index idx_statistics_day on statistics (direction_id, (date(created_at)), rate);
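On MySQL 5.7, where functional key parts are not available, a similar effect can probably be had with an indexed generated column. This is only a sketch: the column name created_day and the index name idx_statistics_created_day are hypothetical, and building an index on a table of this size will take time, so it is worth trying on a copy first.

```sql
-- Hypothetical names: created_day, idx_statistics_created_day.
-- Step 1: add a virtual column holding the date part of created_at.
ALTER TABLE statistics
    ADD COLUMN created_day DATE
        GENERATED ALWAYS AS (DATE(created_at)) VIRTUAL;

-- Step 2: index it together with the filter and aggregate columns.
ALTER TABLE statistics
    ADD INDEX idx_statistics_created_day (direction_id, created_day, rate);

-- The query then filters and groups on the generated column directly.
SELECT AVG(rate) AS rate, created_day
FROM statistics
WHERE direction_id = ? AND created_day BETWEEN ? AND ?
GROUP BY created_day;
```

This is an alternative to the functional index, not an addition to it; one of the two should be enough.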

For the average daily rate, do you mean this?

SELECT AVG(rate) AS rate, 
       DATE(created_at) 
    FROM statistics 
    WHERE direction_id = ? 
      AND created_at BETWEEN ? AND ? 
    GROUP BY DATE(created_at)

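And have INDEX(direction_id, created_at, rate); it is both "covering" and "composite". EXPLAIN will say "Using index" to indicate "covering", which means the entire query can be executed by looking only at the index's BTree, without touching the table rows. So "covering" gives an extra performance boost.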
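One way to check this is to run EXPLAIN on the query once the composite index exists. The literal values below are placeholder assumptions; substitute a real direction_id and date range.

```sql
-- Placeholder values, for illustration only.
EXPLAIN
SELECT AVG(rate) AS rate, DATE(created_at) AS created_day
FROM statistics
WHERE direction_id = 42
  AND created_at BETWEEN '2024-01-01 00:00:00' AND '2024-01-31 23:59:59'
GROUP BY DATE(created_at);

-- Things to look for in the result:
--   key   : the composite index, not statistics_direction_id_foreign
--   Extra : includes "Using index", i.e. the index is covering
```

If "Using index" does not appear in Extra, the optimizer is still reading table rows, and it is worth checking that all three columns are in the index in that order.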

Changing to the fancy index involving DATE(created_at) may not help in this query.

PARTITIONing does not appear to be indicated here.

A "summary table" may be indicated: http://mysql.rjweb.org/doc.php/summarytables
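As a rough sketch of what such a summary table could look like for this query (the table name statistics_daily and its columns are hypothetical, and the daily roll-up schedule is an assumption): store a sum and a count per direction per day, so the average can be derived without reading the raw rows.

```sql
-- Hypothetical summary table: one row per direction per day.
CREATE TABLE statistics_daily (
    direction_id INT UNSIGNED  NOT NULL,
    created_day  DATE          NOT NULL,
    rate_sum     DECIMAL(24,6) NOT NULL,
    rate_count   INT UNSIGNED  NOT NULL,
    PRIMARY KEY (direction_id, created_day)
);

-- Roll up yesterday's raw rows; safe to re-run for the same day.
INSERT INTO statistics_daily (direction_id, created_day, rate_sum, rate_count)
SELECT direction_id, DATE(created_at), SUM(rate), COUNT(*)
FROM statistics
WHERE created_at >= CURDATE() - INTERVAL 1 DAY
  AND created_at <  CURDATE()
GROUP BY direction_id, DATE(created_at)
ON DUPLICATE KEY UPDATE
    rate_sum   = VALUES(rate_sum),
    rate_count = VALUES(rate_count);

-- Daily average read from the summary table instead of the raw rows.
SELECT created_day, rate_sum / rate_count AS rate
FROM statistics_daily
WHERE direction_id = ? AND created_day BETWEEN ? AND ?;
```

Keeping a sum and a count rather than a stored average means days can later be combined into weeks or months correctly. Note that the roll-up filters on created_at alone, which an index leading with direction_id does not serve well; whether that needs its own index on created_at is something to verify on the real data.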