Better structure of MySQL table for MySQL performance

Question

I am evaluating which table structure is better in terms of MySQL performance. Suppose I have the 2 table structures shown below.

Reference table structure 1:

CREATE TABLE `references_1` (
  `id` bigint(30) NOT NULL AUTO_INCREMENT,
  `entity_id` int(11) DEFAULT NULL,
  `reference_id` varchar(50) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
  `reference_type` varchar(50) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
  PRIMARY KEY (`id`),
  KEY `index_on_entity_id` (`entity_id`),
  KEY `index_mappings_on_reference_id_and_reference_type` (`reference_id`,`reference_type`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8

mysql> select * from references_1 where entity_id = 1;
+----+-----------+--------------+------------------+
| id | entity_id | reference_id | reference_type   |
+----+-----------+--------------+------------------+
|  1 |         1 | 25636826     | reference_type_1 |
|  2 |         1 | 2563XCDA6826 | reference_type_2 |
|  3 |         1 | 16992176     | reference_type_3 |
|  4 |         1 | 4521882      | reference_type_4 |
+----+-----------+--------------+------------------+
4 rows in set (0.00 sec)


Reference table structure 2:


CREATE TABLE `references_2` (
  `id` bigint(30) NOT NULL AUTO_INCREMENT,
  `entity_id` int(11) DEFAULT NULL,
  `reference_type_1` varchar(50) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
  `reference_type_2` varchar(50) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
  `reference_type_3` varchar(50) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
  `reference_type_4` varchar(50) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
  PRIMARY KEY (`id`),
  KEY `index_on_entity_id` (`entity_id`),
  KEY `index_on_reference_type_1` (`reference_type_1`),
  KEY `index_on_reference_type_2` (`reference_type_2`),
  KEY `index_on_reference_type_3` (`reference_type_3`),
  KEY `index_on_reference_type_4` (`reference_type_4`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8

mysql> select * from references_2 where entity_id = 1;
+----+-----------+------------------+------------------+------------------+------------------+
| id | entity_id | reference_type_1 | reference_type_2 | reference_type_3 | reference_type_4 |
+----+-----------+------------------+------------------+------------------+------------------+
|  2 |         1 | 25636826         | 2563XCDA6826     | 16992176         | 4521882          |
+----+-----------+------------------+------------------+------------------+------------------+

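As the data grows, which structure will help query performance?
How does MySQL I/O work? Does increasing the number of rows fetched affect I/O performance?
What other factors should I consider here that I may have missed?

Please share your views. Thanks in advance.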

Edit:

Query:

select * from references_1 where entity_id = 1; -- assuming entity_id has an index.

How does INSERT (write) performance compare between the two table structures?

Answer 1

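MySQL's InnoDB storage engine (the default) stores rows in fixed-size pages (16KB per page by default). A certain number of rows fit in a single page, depending on the row size; i.e., if the rows are smaller, more rows fit per page.

A page is the unit in which data is loaded from storage into RAM. So if your query references one row on a page, the entire page is loaded into RAM, and all other rows on that same page are then faster to access.

A single row is not split up. It is stored within one page (except for very long varchar or text/blob columns, which can overflow onto additional pages).

Assuming rows for the same entity_id are likely to be grouped together, the storage and performance difference between your two table designs is really quite small. It's true that in the first design you have extra rows, and therefore extra instances of id and entity_id. But those are just a bigint and an int, so the overhead is small. The other columns will use the same storage.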
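To see how this plays out for the two designs, the page size and the approximate row size can be read from the server. This is only a sketch: it assumes both tables live in the currently selected schema, and the figures in information_schema are estimates for InnoDB, not exact counts.

```sql
-- Page size in bytes (16384 unless the server was initialized differently).
SELECT @@innodb_page_size;

-- Estimated row count, average row length, and data/index size per table.
-- Smaller avg_row_length means more rows fit in each page.
SELECT table_name,
       table_rows,
       avg_row_length,
       data_length,
       index_length
FROM information_schema.TABLES
WHERE table_schema = DATABASE()
  AND table_name IN ('references_1', 'references_2');
```

Comparing data_length and index_length for the two tables after loading the same sample data gives a rough idea of how many pages each design needs.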

Other considerations:

Do you expect to extend the reference types to 5 or more? The second design would require you to use ALTER TABLE to add a column.
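As a rough illustration of that difference, here is what adding a fifth reference type might look like in each design. The column name reference_type_5, the DEFAULT '' clause, and the sample value are assumptions made for the example, not part of the original schema.

```sql
-- Design 2: a new reference type is a schema change.
ALTER TABLE references_2
  ADD COLUMN reference_type_5 varchar(50)
      CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  ADD KEY index_on_reference_type_5 (reference_type_5);

-- Design 1: a new reference type is just another row.
INSERT INTO references_1 (entity_id, reference_id, reference_type)
VALUES (1, '99999999', 'reference_type_5');
```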

Does the reference type need to be a varchar? Could you encode it as a tinyint or an ENUM? That would save space.
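One possible way to do this on the first design is sketched below. It assumes the only values in use are the four shown in the sample data; an ENUM with up to 255 members is stored in 1 byte per row instead of the full string.

```sql
-- Convert reference_type from varchar(50) to an ENUM.
-- Existing values must all appear in the list, or the statement
-- fails (or stores empty values) depending on the SQL mode.
ALTER TABLE references_1
  MODIFY reference_type ENUM('reference_type_1',
                             'reference_type_2',
                             'reference_type_3',
                             'reference_type_4') NOT NULL;
```

Note that adding a new member to the ENUM later is itself an ALTER TABLE, so this trades some of the flexibility of the first design for the space saving.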

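On the other hand, your second design saves even more space, because the reference types are just part of the table metadata (the column names). So they take up space only once, not on every row.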

Answer 2

Consider this variation of references_1:

CREATE TABLE `references_3` (
  `entity_id` int(11) NOT NULL,  -- columns in a PRIMARY KEY must be NOT NULL
  `reference_id` varchar(50) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
  `reference_type` varchar(50) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
  PRIMARY KEY (entity_id, reference_type),
  KEY `id_type` (`reference_id`,`reference_type`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

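Notes:

I changed the PK, assuming that the pair (entity_id, reference_type) is unique.
Is id referenced by any other table?
I second Bill's comments about ENUM, storage space, a 5th type, etc.
See "pivot-table" for how to display the 4 rows as 1 row.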
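For the last note, one common way to pivot is conditional aggregation. This is a sketch against references_3, assuming the four reference_type values from the sample data; it is not necessarily the exact approach the "pivot-table" material describes.

```sql
-- Collapse the rows for one entity into a single row,
-- one output column per reference type.
SELECT entity_id,
       MAX(CASE WHEN reference_type = 'reference_type_1' THEN reference_id END) AS reference_type_1,
       MAX(CASE WHEN reference_type = 'reference_type_2' THEN reference_id END) AS reference_type_2,
       MAX(CASE WHEN reference_type = 'reference_type_3' THEN reference_id END) AS reference_type_3,
       MAX(CASE WHEN reference_type = 'reference_type_4' THEN reference_id END) AS reference_type_4
FROM references_3
WHERE entity_id = 1
GROUP BY entity_id;
```

With the PRIMARY KEY on (entity_id, reference_type), all rows for one entity_id sit next to each other in the clustered index, so this query typically reads only one or a few pages.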
