MySQL reduce index size for huge table

Question

For my online store, I have a table that I use for search:

CREATE TABLE `store_search` (
  `term` varchar(50) NOT NULL DEFAULT '',
  `content_id` int(10) unsigned NOT NULL,
  `type` enum('keyword','tag') NOT NULL DEFAULT 'keyword',
  `random` int(10) unsigned NOT NULL,
  `saving` int(10) unsigned NOT NULL,
  PRIMARY KEY (`content_id`,`term`,`type`),
  UNIQUE KEY `saving` (`term`,`saving`,`random`,`content_id`,`type`),
  UNIQUE KEY `random` (`term`,`random`,`content_id`,`type`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED

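Products can be listed in two ways: in random order (based on the random column) or by discount (based on the saving column). Past tests showed that using UNIQUE constraints for the ordering performs much better than using standard indexes combined with ORDER BY. A query can look like this: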

mysql> EXPLAIN SELECT content_id FROM store_search USE INDEX (random) WHERE term LIKE 'shirt%' AND type='keyword' LIMIT 2000,100;
+----+-------------+--------------+-------+---------------+--------+---------+------+---------+--------------------------+
| id | select_type | table        | type  | possible_keys | key    | key_len | ref  | rows    | Extra                    |
+----+-------------+--------------+-------+---------------+--------+---------+------+---------+--------------------------+
|  1 | SIMPLE      | store_search | range | random        | random | 152     | NULL | 9870580 | Using where; Using index |
+----+-------------+--------------+-------+---------------+--------+---------+------+---------+--------------------------+

So I can avoid an ORDER BY clause (no filesort is done with this approach). The PRIMARY KEY is used for self-joins when searching for multiple terms:

mysql> EXPLAIN SELECT DISTINCT x.content_id
    -> FROM store_search x USE INDEX (saving)
    -> INNER JOIN store_search y ON x.content_id=y.content_id
    -> WHERE x.term LIKE 'shirt%' AND x.type='keyword' AND y.term LIKE 'blue%' AND y.type='keyword'
    -> LIMIT 0,100;
+----+-------------+-------+-------+-----------------------+---------+---------+--------------+----------+-------------------------------------------+
| id | select_type | table | type  | possible_keys         | key     | key_len | ref          | rows     | Extra                                     |
+----+-------------+-------+-------+-----------------------+---------+---------+--------------+----------+-------------------------------------------+
|  1 | SIMPLE      | x     | range | PRIMARY,saving,random | saving  | 152     | NULL         | 11449970 | Using where; Using index; Using temporary |
|  1 | SIMPLE      | y     | ref   | PRIMARY,saving,random | PRIMARY | 4       | x.content_id |       20 | Using where; Using index; Distinct        |
+----+-------------+-------+-------+-----------------------+---------+---------+--------------+----------+-------------------------------------------+

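As I said, this solution has worked fine so far. My problem now is that the table has grown so large (~500 million rows) that the indexes no longer fit in memory. As a result, INSERT and UPDATE statements are very slow. The data takes up 23 GB and the indexes 32 GB, so the table uses 55 GB in total. Testing is possible, but copying this table takes a lot of time. Does anyone have an approach for reducing the index size? I am thinking about converting the string column to latin1, but could I also consolidate some of the indexes?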

Answer 1

term LIKE 'shirt%' is a range lookup. INDEX(term, ...) will not get past term when filtering, so it never reaches type or any other column.

This and other basic indexing principles are discussed in my Index Cookbook.

So... WHERE term LIKE 'shirt%' AND type='keyword' begs for INDEX(type, term): the equality column first, the range column last. Adding any further columns will not help with filtering.
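A minimal sketch of such an index, assuming the equality column meant here is `type` (whose value in the query is 'keyword'). The index name `type_term` is hypothetical, and on a table this size adding an index is itself a long operation, so treat this as an illustration of the column order rather than a recommendation to run it as-is:

```sql
-- Hypothetical index name. Equality column (type) first, range column (term) last.
ALTER TABLE store_search ADD INDEX type_term (`type`, `term`);

-- Both predicates can now narrow the scanned index range:
-- type = 'keyword' selects one slice of the index, and
-- term LIKE 'shirt%' is a range inside that slice.
SELECT content_id
    FROM store_search
    WHERE `type` = 'keyword'
      AND term LIKE 'shirt%';
```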

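However... what you are depending on is "covering". This is where all the needed columns are in a single index. In that case, the query can be performed entirely in the index BTree without touching the data BTree. That is, adding extra columns can be beneficial.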
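One way to check whether a query is covered is to look for "Using index" in the Extra column of EXPLAIN. The pair below is a sketch against the table from the question; the second query is an illustrative variation, not one the asker runs:

```sql
-- Expected to be covered: term, type and content_id are all part of index `random`.
EXPLAIN SELECT content_id
    FROM store_search USE INDEX (random)
    WHERE term LIKE 'shirt%' AND type = 'keyword';

-- Expected NOT to be covered: `saving` is in neither index `random` nor the
-- PRIMARY KEY, so each matching index entry needs a lookup in the data BTree.
EXPLAIN SELECT content_id, saving
    FROM store_search USE INDEX (random)
    WHERE term LIKE 'shirt%' AND type = 'keyword';
```

In InnoDB a secondary index also carries the primary key columns, which is why a short secondary index can still cover a query that selects PK columns.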

Several things are going on in this query and the index it uses:

SELECT  content_id
    FROM  store_search USE INDEX (random)
    WHERE  term LIKE 'shirt%'
      AND  type='keyword'
    LIMIT  2000,100; 
UNIQUE KEY `random` (`term`,`random`,`content_id`,`type`)

Here are some of them:

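The index is "covering".
There is no ORDER BY, so the output is probably ordered first by term (assuming there may be multiple phrases starting with 'shirt'), and only secondarily by random. This is not quite what you wanted, but it may happen to work.
The LIMIT requires it to scan 2000+100 rows of the index, then quit. If there are not enough shirts, it will stop short. This probably seems "fast".
UNIQUE is probably irrelevant, and wasteful for inserts.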
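To make the ordering point concrete, the order that the index scan happens to deliver corresponds to the explicit ORDER BY below. This is a sketch for illustration; whether the optimizer still avoids a filesort with the clause written out should be confirmed with EXPLAIN on the real data:

```sql
SELECT content_id
    FROM store_search USE INDEX (random)
    WHERE term LIKE 'shirt%'
      AND type = 'keyword'
    ORDER BY term, random   -- same order as the leading columns of index `random`
    LIMIT 2000, 100;
```

Written this way it is visible that all rows for 'shirt' come before any row for 'shirts', so the random column only shuffles rows within each individual term.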

Let's dissect the next query, SELECT DISTINCT x.content_id ....

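You have replaced a "filesort" with similar (possibly faster) code for the DISTINCT. There is probably no net gain; time it.
If there are 999 blue shirts, it will find all 999, then de-duplicate them, then deliver 100 of them.
Without an ORDER BY, you cannot predict which 100 will be delivered.
Since you have already collected all 999, adding ORDER BY RAND() would not add much overhead.
Do you really want to return 'blue-green' shirts, but not 'light blue' ones? And what about 'dress%' picking up 'dress pants'?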
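A sketch of the ORDER BY RAND() variant mentioned above. The derived table and its alias `matches` are assumptions made for this example; wrapping the DISTINCT in a subquery keeps the random sort separate from the de-duplication:

```sql
SELECT content_id
    FROM (
        -- Collect every distinct product that matches both terms.
        SELECT DISTINCT x.content_id
            FROM store_search x
            INNER JOIN store_search y ON x.content_id = y.content_id
            WHERE x.term LIKE 'shirt%' AND x.type = 'keyword'
              AND y.term LIKE 'blue%'  AND y.type = 'keyword'
    ) AS matches          -- hypothetical alias
    ORDER BY RAND()       -- shuffle the already-collected set
    LIMIT 0, 100;
```

Note that RAND() gives a fresh order on every execution, so paging through results with increasing LIMIT offsets will not be stable.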

Bottom line

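Replace the 3 indexes with just PRIMARY KEY(type, term, content_id). By coming in via the PK, you effectively get "covering".
Use either ORDER BY random or ORDER BY RAND(); see which works better for you. (The latter is more random!)
Rethink LIKE 'shirt%'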
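The first two recommendations could look roughly like the sketch below. It assumes the three indexes carry the names shown in the question's CREATE TABLE, and it has not been run against the asker's data:

```sql
-- A single ALTER, so the table is rebuilt once. On ~500 million rows this is
-- expected to take a long time; try it on a copy first.
ALTER TABLE store_search
    DROP PRIMARY KEY,
    DROP INDEX saving,
    DROP INDEX random,
    ADD PRIMARY KEY (`type`, `term`, `content_id`);

-- Random-order listing; swap in ORDER BY RAND() or ORDER BY saving and time each.
SELECT content_id
    FROM store_search
    WHERE `type` = 'keyword'
      AND term LIKE 'shirt%'
    ORDER BY random
    LIMIT 2000, 100;
```

The new key contains the same three columns as the old PRIMARY KEY, only reordered, so existing rows cannot violate its uniqueness. The ORDER BY will need a sort, which is the trade-off accepted here in exchange for dropping the two large secondary indexes.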

Ultimately, the underlying problem is that EAV schema design sucks. I discuss this further.

Tags: mysql, indexing, innodb, entity-attribute-value, mysql-5.6