Why does the output of EXPLAIN change after each SHOW INDEX?

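I am trying to improve the performance of some queries by using EXPLAIN to check how indexes are used, and I noticed that every time I run SHOW INDEX FROM TableB; the rows column in the EXPLAIN output for the same query changes.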

For example:

mysql> EXPLAIN Select A.id
     From TableA A
     Inner join TableB B
         On A.address = B.address And A.code = B.code
     Group by A.id
     Having count(distinct B.id) = 1;
+----+-------------+-------+--------+---------------+---------+---------+---------------------------------------+-------+----------------------------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref                                   | rows  | Extra                                        |
+----+-------------+-------+--------+---------------+---------+---------+---------------------------------------+-------+----------------------------------------------+
|  1 | SIMPLE      | B     | index  | test_index    | PRIMARY | 518     | NULL                                  | 10561 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | A     | eq_ref | PRIMARY       | PRIMARY | 514     | db.B.address,db.B.code                |     1 |                                              |
+----+-------------+-------+--------+---------------+---------+---------+---------------------------------------+-------+----------------------------------------------+
2 rows in set (0.00 sec)

mysql> show index from TableB;
+-----------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table     | Non_unique | Key_name     | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-----------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| TableB    |          0 | PRIMARY      |            1 | id          | A         |           7 |     NULL | NULL   |      | BTREE      |         |
| TableB    |          0 | PRIMARY      |            2 | address     | A         |          21 |     NULL | NULL   |      | BTREE      |         |
| TableB    |          0 | PRIMARY      |            3 | code        | A         |       10402 |     NULL | NULL   |      | BTREE      |         |
| TableB    |          1 | test_index   |            1 | address     | A         |           1 |     NULL | NULL   |      | BTREE      |         |
| TableB    |          1 | test_index   |            2 | code        | A         |       10402 |     NULL | NULL   |      | BTREE      |         |
| TableB    |          1 | test_index   |            3 | id          | A         |       10402 |     NULL | NULL   |      | BTREE      |         |
+-----------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
6 rows in set (0.03 sec)

and then...

mysql> EXPLAIN Select A.id
        From TableA A
        Inner join TableB B
           On A.address = B.address And A.code = B.code Group by A.id
        Having count(distinct B.id) = 1;
+----+-------------+-------+--------+---------------+---------+---------+---------------------------------------+-------+----------------------------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref                                   | rows  | Extra                                        |
+----+-------------+-------+--------+---------------+---------+---------+---------------------------------------+-------+----------------------------------------------+
|  1 | SIMPLE      | B     | index  | test_index    | PRIMARY | 518     | NULL                                  | 9800  | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | A     | eq_ref | PRIMARY       | PRIMARY | 514     | db.B.address,db.B.code                |     1 |                                              |
+----+-------------+-------+--------+---------------+---------+---------+---------------------------------------+-------+----------------------------------------------+
2 rows in set (0.00 sec)

Why does this happen?

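The rows column should be treated only as a rough estimate. It is not an exact number.

It is based on a statistical estimate of how many rows will be examined during the query. The actual number of rows cannot be known until the query is really executed.

The statistics are based on samples that are read from the table periodically. These samples are re-read from time to time, for example after you run ANALYZE TABLE, certain INFORMATION_SCHEMA queries, or certain SHOW statements. Each re-read can land on different sample pages, which is why the estimate moved from 10561 to 9800 after your SHOW INDEX.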
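If you want to see or control this behavior, the following is a minimal sketch, assuming TableB is an InnoDB table (the question does not say which storage engine is in use). In InnoDB, the `innodb_stats_on_metadata` system variable controls whether metadata statements such as SHOW INDEX trigger a statistics refresh; whether it is enabled by default, and whether it applies at all, depends on the server version and on whether persistent statistics are in use.

```sql
-- Assumption: TableB is an InnoDB table; the question does not state the engine.

-- Check whether metadata statements (SHOW INDEX, SHOW TABLE STATUS,
-- some INFORMATION_SCHEMA queries) currently trigger a statistics refresh.
SHOW VARIABLES LIKE 'innodb_stats_on_metadata';

-- Stop those statements from re-sampling the index statistics.
-- Changing a global variable requires sufficient privileges.
SET GLOBAL innodb_stats_on_metadata = OFF;

-- Refresh the statistics explicitly, at a moment you choose.
ANALYZE TABLE TableB;

-- The Cardinality column shows the sampled estimates the optimizer works from.
SHOW INDEX FROM TableB;
```

With the variable off, repeated SHOW INDEX calls should leave the estimates alone until something else, such as ANALYZE TABLE, refreshes them.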

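I don't see a 20% variation in the statistics as a big deal. In many situations, think of the graph as an upside-down parabola: what you need to know is which side of the turning point you are on. In complex queries where the optimizer might get it wrong, it needs more than simple statistics, for example the histograms in MariaDB 10.0 / 10.1. (I don't have enough experience to say whether this has made much headway.)

Your particular query will probably be executed in only one way, regardless of the statistics. An example of a complex query is a JOIN with WHERE clauses that filter each table; the optimizer has to decide which table to start with. Another case is a single table with a WHERE and an ORDER BY that cannot both be handled by a single index. Should it use an index to filter and then have to sort? Or should it use an index for the ORDER BY but have to filter on the fly?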
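The single-table case can be sketched as follows. The `orders` table, its columns, and both index names are hypothetical and are not taken from the question; the range condition is chosen so that no single index serves both the WHERE and the ORDER BY.

```sql
-- Hypothetical table for illustration only; not part of the question's schema.
CREATE TABLE orders (
  id          INT AUTO_INCREMENT PRIMARY KEY,
  customer_id INT NOT NULL,
  created_at  DATETIME NOT NULL,
  INDEX idx_customer (customer_id),  -- helps the WHERE filter
  INDEX idx_created  (created_at)    -- helps the ORDER BY
);

-- Let the optimizer choose, based on its row estimates.
EXPLAIN SELECT * FROM orders
WHERE customer_id > 100
ORDER BY created_at
LIMIT 10;

-- Plan 1: filter through the index, then sort the matching rows.
EXPLAIN SELECT * FROM orders FORCE INDEX (idx_customer)
WHERE customer_id > 100
ORDER BY created_at
LIMIT 10;

-- Plan 2: read rows in created_at order and filter on the fly.
EXPLAIN SELECT * FROM orders FORCE INDEX (idx_created)
WHERE customer_id > 100
ORDER BY created_at
LIMIT 10;
```

Comparing the key, rows, and Extra columns of the three plans shows which trade-off the optimizer picked. This is the kind of decision where the estimates matter, unlike the query in the question.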