How to optimize a join when searching in categories

I have a table that holds the following items:

CREATE TABLE `ost_content` (
  `uid` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
  `type` enum('media','serial','season','series') NOT NULL,
  `alias` varchar(200) NOT NULL,
  `views` mediumint(7) NOT NULL DEFAULT '0',
  `ratings_count` enum('0','1','2','4','5') NOT NULL DEFAULT '0',
  `ratings_sum` mediumint(5) NOT NULL DEFAULT '0',
  `upload_date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `conversion_status` enum('converting','error','success','announcement') NOT NULL DEFAULT 'converting',
  PRIMARY KEY (`uid`),
  UNIQUE KEY `idx_uid_type` (`uid`,`type`),
  KEY `idx_type` (`type`),
  KEY `idx_upload_date DESC` (`upload_date`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

And a table that links items to categories:

CREATE TABLE `ost_categories2media` (
  `categories2media_id` mediumint(6) unsigned NOT NULL AUTO_INCREMENT,
  `categories2media_category_id` smallint(5) unsigned NOT NULL,
  `categories2media_uid` mediumint(8) unsigned NOT NULL,
  PRIMARY KEY (`categories2media_id`),
  KEY `categories2media_media_id` (`categories2media_uid`),
  KEY `categories2media_category_id` (`categories2media_category_id`)
) ENGINE=InnoDB AUTO_INCREMENT=501114 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

Then I run this query:

SELECT
    c1.uid,
    c1.alias,
    c1.type,
    c1.views,
    c1.upload_date,
    c1.ratings_sum,
    c1.ratings_count,
    c1.conversion_status
FROM
    ost_content c1
LEFT JOIN ost_categories2media c2m ON c2m.categories2media_uid = c1.uid
WHERE
    c2m.categories2media_category_id = '53'
AND c1.conversion_status IN ('success', 'announcement')
AND c1.type IN ('serial', 'media')
ORDER BY
    c1.upload_date DESC
LIMIT 16, 16

It runs slowly, and the lookup on categories2media_category_id examines a lot of rows:

+----+-------------+-------+--------+--------------------------------------------------------+------------------------------+---------+---------------------------------+-------+----------------------------------------------+
| id | select_type | table | type   | possible_keys                                          | key                          | key_len | ref                             | rows  | Extra                                        |
+----+-------------+-------+--------+--------------------------------------------------------+------------------------------+---------+---------------------------------+-------+----------------------------------------------+
|  1 | SIMPLE      | c2m   | ref    | categories2media_media_id,categories2media_category_id | categories2media_category_id | 2       | const                           | 32076 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | c1    | eq_ref | PRIMARY,idx_uid_type,idx_type                          | PRIMARY                      | 3       | uakino.c2m.categories2media_uid |     1 | Using where                                  |
+----+-------------+-------+--------+--------------------------------------------------------+------------------------------+---------+---------------------------------+-------+----------------------------------------------+

How can I optimize or rewrite this query?

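MySQL indexes are like cooks: too many of them are not much use, because MySQL generally uses only one index per table in a query. Let's look at ost_categories2media: it has three separate indexes on three columns. You would be better off with two indexes like this.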

  PRIMARY KEY (`categories2media_id`),
  KEY `categories2media_media_id` (`categories2media_uid`,`categories2media_category_id`)

Now MySQL no longer has to choose between the indexes on categories2media_uid and categories2media_category_id; it has one index that covers both!
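To apply that layout to the existing table, something like the following should work. It is only a sketch: it reuses the index names from the question's CREATE TABLE and has not been run against the real data. Keep in mind that this composite index leads with categories2media_uid, so it serves lookups that start from the uid; a filter on categories2media_category_id alone cannot use it, so compare the EXPLAIN output before and after.

```sql
-- Replace the two single-column indexes with one composite index.
ALTER TABLE ost_categories2media
  DROP INDEX categories2media_media_id,
  DROP INDEX categories2media_category_id,
  ADD INDEX categories2media_media_id
    (categories2media_uid, categories2media_category_id);

-- Check which index and join order the optimizer picks afterwards.
EXPLAIN
SELECT c1.uid
FROM ost_content c1
JOIN ost_categories2media c2m ON c2m.categories2media_uid = c1.uid
WHERE c2m.categories2media_category_id = 53;
```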

Looking at your ost_content table, we see

  PRIMARY KEY (`uid`),
  UNIQUE KEY `idx_uid_type` (`uid`,`type`),
  KEY `idx_type` (`type`),
  KEY `idx_upload_date DESC` (`upload_date`)

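Some of these indexes are a bit redundant. Any query that filters on uid can use the PK, and any query that filters on type can use idx_type, which means idx_uid_type is only there to enforce uniqueness. But we can make it more useful like this: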

  PRIMARY KEY (`uid`),
  UNIQUE KEY `idx_uid_type` (`type`,`uid`),
  KEY `idx_upload_date DESC` (`upload_date`)

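We got rid of one index! That should make index maintenance on inserts and updates faster. You still have an index on upload_date that is not used by this particular query. So how about a composite index?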

  PRIMARY KEY (`uid`),
  UNIQUE KEY `idx_uid_type` (`type`,`uid`),
  KEY `idx_upload_date DESC` (`uid`,`upload_date`)
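The final layout could be reached with a single ALTER TABLE along these lines. This is a sketch that keeps the existing index names from the question; whether the optimizer actually prefers the new indexes for the query depends on the data, so it is worth re-running EXPLAIN afterwards.

```sql
-- Drop the redundant indexes and recreate them with the new column order.
ALTER TABLE ost_content
  DROP INDEX idx_type,
  DROP INDEX idx_uid_type,
  DROP INDEX `idx_upload_date DESC`,
  ADD UNIQUE INDEX idx_uid_type (`type`, uid),
  ADD INDEX `idx_upload_date DESC` (uid, upload_date);
```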

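First, the LEFT JOIN is not necessary: the WHERE condition on c2m already discards any row without a match. So you can write the query as: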

SELECT c.*
FROM ost_content c JOIN
     ost_categories2media c2m
     ON c2m.categories2media_uid = c.uid
WHERE c2m.categories2media_category_id = '53' AND
      c.conversion_status IN ('success', 'announcement') AND
      c.type IN ('serial', 'media')
ORDER BY c.upload_date DESC
LIMIT 16, 16;

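Unfortunately, your conditions on the content table are not simple = conditions. If they were, I would suggest an index on ost_content(conversion_status, type, uid). That may still be the better option.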

Another option is to go the other way: an index on ost_categories2media(categories2media_category_id, categories2media_uid).
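As a rough sketch, the two candidate indexes could be created as follows. The index names idx_status_type_uid and idx_category_uid are made up for this example; only the column lists come from the suggestions above.

```sql
-- Option 1: index on the content table (hypothetical index name).
CREATE INDEX idx_status_type_uid
  ON ost_content (conversion_status, `type`, uid);

-- Option 2: index on the link table, category first (hypothetical index name).
CREATE INDEX idx_category_uid
  ON ost_categories2media (categories2media_category_id, categories2media_uid);
```

You can create both and let EXPLAIN show which one the optimizer picks, then drop the one that goes unused.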

You may find that the first composite index works best with this query:

SELECT c.*
FROM ((SELECT c.*
       FROM ost_content c JOIN
            ost_categories2media c2m
            ON c2m.categories2media_uid = c.uid
       WHERE c2m.categories2media_category_id = '53' AND
             c.conversion_status = 'success' AND
             c.type IN ('serial', 'media')
      ) UNION ALL
      (SELECT c.*
       FROM ost_content c JOIN
            ost_categories2media c2m
            ON c2m.categories2media_uid = c.uid
       WHERE c2m.categories2media_category_id = '53' AND
             c.conversion_status = 'announcement' AND
             c.type IN ('serial', 'media')
      ) 
     ) c
ORDER BY c.upload_date DESC
LIMIT 16, 16;

This looks more complicated, but each subquery can take advantage of the index, so it may perform better.