How to optimize query on normalized database structure?

Question

I'm trying to optimize a query that currently takes 0.00x seconds on a MySQL 5.x database to retrieve data on a system with no load.

The query looks like this:

SELECT 
   a.article_id,
   GROUP_CONCAT(attr_f.attr_de) AS functions, 
   GROUP_CONCAT(attr_n.attr_de) AS miscellaneous
FROM `articles_test` a
LEFT JOIN articles_attr AS f ON a.article_id = f.article_id AND f.attr_group_id = 26
LEFT JOIN articles_attr AS attr ON a.article_id = attr.article_id AND attr.attr_group_id = 27
LEFT JOIN cat_attr AS attr_f ON attr_f.attr_id = f.attr_id
LEFT JOIN cat_attr AS attr_n ON attr_n.attr_id = attr.attr_id
WHERE a.article_id = 11

EXPLAIN returns:

id  select_type  table   partitions  type   possible_keys            key           key_len  ref                rows  filtered  Extra
1   SIMPLE       a       NULL        const  article_id               article_id    3        const              1     100.00    NULL
1   SIMPLE       f       NULL        ref    article_id_2,article_id  article_id_2  6        const,const        2     100.00    Using index
1   SIMPLE       attr    NULL        ref    article_id_2,article_id  article_id_2  6        const,const        4     100.00    Using index
1   SIMPLE       attr_f  NULL        ref    attr_id                  attr_id       3        test.f.attr_id     1     100.00    NULL
1   SIMPLE       attr_n  NULL        ref    attr_id                  attr_id       3        test.attr.attr_id  1     100.00    NULL

All the fields used by the query are indexed. Is there another way to retrieve the data with a simpler and faster query?

CREATE TABLE `articles_attr` (
 `date_created` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
 `article_id` mediumint(8) unsigned NOT NULL,
 `attr_group_id` mediumint(8) NOT NULL,
 `attr_id` mediumint(8) unsigned DEFAULT NULL,
 `value` varchar(255) DEFAULT NULL,
 UNIQUE KEY `article_id_2` (`article_id`,`attr_group_id`,`attr_id`),
 KEY `article_id` (`article_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 

CREATE TABLE `cat_attr` (
 `attr_id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
 `attr_group_id` mediumint(8) unsigned NOT NULL,
 `sort` tinyint(4) NOT NULL,
 `attr_de` varchar(255) NOT NULL,
 UNIQUE KEY `attr_id` (`attr_id`,`attr_group_id`),
 UNIQUE KEY `attr_group_id` (`attr_group_id`,`attr_de`)
) ENGINE=InnoDB AUTO_INCREMENT=380 DEFAULT CHARSET=utf8

CREATE TABLE `articles_test` (
 `article_id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
 UNIQUE KEY `article_id` (`article_id`)
) ENGINE=InnoDB AUTO_INCREMENT=221614 DEFAULT CHARSET=latin1

The table articles_attr contains about 500,000 rows.

Answer 1

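First of all, 9 milliseconds is already not bad for a query like this. There is no radical improvement to be had. You may be able to squeeze another millisecond or two out of the query, but probably not more.

Your three-column index on articles_attr looks good. You could try switching the order of the first two columns in the index to see whether you get better performance.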
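One way to try the swapped column order is to add a second index next to the existing one and compare the plans. This is only a sketch: the index name `group_article_attr` is made up for illustration, and whether the optimizer prefers it (or whether it is any faster) has to be measured on the real data.

```sql
-- Experimental index with the first two columns swapped.
-- The name group_article_attr is hypothetical.
ALTER TABLE articles_attr
  ADD UNIQUE KEY group_article_attr (attr_group_id, article_id, attr_id);

-- Re-run EXPLAIN on the original query and check the "key" column
-- for tables f and attr to see which index was chosen.

-- Remove the experimental index again if it does not help.
ALTER TABLE articles_attr
  DROP INDEX group_article_attr;
```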

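Your single-column index on that table is unnecessary: the same indexing functionality is already provided because that column comes first in the three-column index. Dropping that index probably won't improve query performance, but it will help insert performance.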
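Dropping the redundant single-column index could look like the following, using the index name from the CREATE TABLE statement in the question:

```sql
-- KEY article_id (article_id) is covered by the leftmost column of
-- UNIQUE KEY article_id_2 (article_id, attr_group_id, attr_id).
ALTER TABLE articles_attr
  DROP INDEX article_id;

-- Confirm which indexes remain on the table.
SHOW INDEX FROM articles_attr;
```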

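GROUP_CONCAT() makes sense here. It is perfectly valid to aggregate the whole result set. For clarity you might add GROUP BY a.article_id; it won't make any difference to performance, because you already select only one value of that column.

On cat_attr, a compound index on (attr_id, attr_de) might help. But that is evidently a small table, so it won't help much.

Do you need the LEFT JOIN operations to join articles_attr to cat_attr? Or, given your data structure, is every value of articles_attr.attr_id guaranteed to find a match in cat_attr.attr_id? If you can change those LEFT JOIN operations to JOINs, you may get a slight speedup.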
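As a sketch of those two suggestions combined, the original query with inner joins and an explicit GROUP BY might read as follows. It assumes every attr_id in articles_attr has a match in cat_attr and that the article has at least one attribute in each of groups 26 and 27; an article lacking either group would return no row here, unlike with the LEFT JOIN version.

```sql
SELECT
   a.article_id,
   GROUP_CONCAT(attr_f.attr_de) AS functions,
   GROUP_CONCAT(attr_n.attr_de) AS miscellaneous
FROM articles_test AS a
-- Inner joins: rows without a match on either side are dropped.
JOIN articles_attr AS f    ON a.article_id = f.article_id    AND f.attr_group_id = 26
JOIN articles_attr AS attr ON a.article_id = attr.article_id AND attr.attr_group_id = 27
JOIN cat_attr AS attr_f    ON attr_f.attr_id = f.attr_id
JOIN cat_attr AS attr_n    ON attr_n.attr_id = attr.attr_id
WHERE a.article_id = 11
-- Added for clarity only; a single article_id is selected anyway.
GROUP BY a.article_id;
```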

Answer 2

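Since your WHERE clause specifies the value of article_id, there is no real need for the SELECT clause to return it. It is better to remove it, also because it does not conform to the SQL standard: if you have aggregation (GROUP_CONCAT), every non-aggregated expression in the SELECT clause must appear in the GROUP BY clause. Adding that GROUP BY (as in the first version of your question) introduces some overhead, so it is better to drop the column.

Since the WHERE condition is on the primary key and you don't need any data from the articles_test table, you can omit articles_test altogether and put the WHERE condition on the foreign key instead.

Finally, there is a kind of Cartesian join going on: every hit in attr_f is combined with every hit in attr_n. This can lead to duplicates in the GROUP_CONCAT output, and it costs performance.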
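To see the multiplication, one could count the rows the original join produces before aggregation. This is a diagnostic sketch, assuming attr_id is never NULL for the article in question: with m attributes in group 26 and n in group 27 (both non-zero), joined_rows comes out as m * n, so each value in functions is repeated n times and each value in miscellaneous m times.

```sql
SELECT
   COUNT(*)                     AS joined_rows,     -- m * n when both groups are non-empty
   COUNT(DISTINCT f.attr_id)    AS group_26_attrs,  -- m
   COUNT(DISTINCT attr.attr_id) AS group_27_attrs   -- n
FROM articles_test AS a
LEFT JOIN articles_attr AS f    ON a.article_id = f.article_id    AND f.attr_group_id = 26
LEFT JOIN articles_attr AS attr ON a.article_id = attr.article_id AND attr.attr_group_id = 27
WHERE a.article_id = 11;
```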

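If such duplicates can be removed, you will probably get better performance by splitting the query into groups: one for the functions output and one for the miscellaneous output. You would then group by attr_group_id.

This also allows the outer joins to be turned into inner joins.

The output is then the unpivoted version of what you are after: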

SELECT     attr.attr_group_id, GROUP_CONCAT(cat.attr_de) AS functions
FROM       articles_attr AS attr 
INNER JOIN cat_attr AS cat ON cat.attr_id = attr.attr_id
WHERE      attr.article_id = 11
       AND attr.attr_group_id IN (26, 27) 
GROUP BY   attr.attr_group_id

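The output now has two rows. The row with 26 in the first column lists the functions in the second column, and the row with 27 lists the miscellaneous attributes.

Yes, the output format is different, but I think you will be able to rework the code that uses this query while benefiting from the (I hope) improved performance.

If you need the pivoted version, use CASE WHEN expressions: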

SELECT     GROUP_CONCAT(CASE attr.attr_group_id WHEN 26 THEN cat.attr_de END) AS functions,
           GROUP_CONCAT(CASE attr.attr_group_id WHEN 27 THEN cat.attr_de END) AS miscellaneous
FROM       articles_attr AS attr 
INNER JOIN cat_attr AS cat ON cat.attr_id = attr.attr_id
WHERE      attr.article_id = 11
       AND attr.attr_group_id IN (26, 27)

Answer 3

`attr_id` mediumint(8) unsigned DEFAULT NULL,

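Why NULL? Don't you always need an attr? The reason I bring this up is that you have no explicit PRIMARY KEY on articles_attr. The NULL prevents the UNIQUE key from being promoted to the PK. Change the column to NOT NULL and promote the UNIQUE key to the PRIMARY KEY.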
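A sketch of that change, assuming no existing row has a NULL attr_id (the ALTER fails or needs data cleanup otherwise) and that nothing else refers to the index name article_id_2:

```sql
-- Check the assumption first: this should return 0.
SELECT COUNT(*) FROM articles_attr WHERE attr_id IS NULL;

-- Make attr_id NOT NULL and turn the three-column UNIQUE key
-- into the table's PRIMARY KEY.
ALTER TABLE articles_attr
  MODIFY `attr_id` mediumint(8) unsigned NOT NULL,
  DROP INDEX `article_id_2`,
  ADD PRIMARY KEY (`article_id`, `attr_group_id`, `attr_id`);
```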

KEY `article_id` (`article_id`)

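Redundant; drop it.

The structure of the many:many table is suboptimal. Several tips: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table

If you don't need "many:many", switch to "1:many"; it is more efficient.

You can probably use JOIN instead of LEFT JOIN, since you need to get all the way to attr_f and attr_n.

Moving the joins for the GROUP_CONCATs into the SELECT might help: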

SELECT  a.article_id, 
        (
        SELECT  GROUP_CONCAT(ca.attr_de)
            FROM  articles_attr AS aa
            JOIN  cat_attr AS ca USING(attr_id)
            WHERE  aa.attr_group_id = 26
              AND  aa.article_id = a.article_id
        ) AS functions, 
        (
        SELECT  GROUP_CONCAT(attr_f.attr_de)
            FROM  ..
            JOIN  ..
            WHERE  .. 
        ) AS miscellaneous
    FROM  `articles_test` a
    WHERE  a.article_id = 11

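But perhaps the most important thing is to avoid making an already bad EAV schema design even worse by normalizing the attributes! That is, get rid of the table cat_attr and move attr_de into articles_attr. This will cut the number of JOINs in half.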
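A rough sketch of what the denormalized layout could look like. The table name `articles_attr_flat` and the choice of primary key are assumptions made for illustration, not something given in the question; the query reuses the CASE-based pivot from Answer 2 and needs no join at all.

```sql
-- Hypothetical replacement table: attr_de is stored directly with the article.
CREATE TABLE `articles_attr_flat` (
  `article_id`    mediumint(8) unsigned NOT NULL,
  `attr_group_id` mediumint(8) unsigned NOT NULL,
  `attr_de`       varchar(255) NOT NULL,
  PRIMARY KEY (`article_id`, `attr_group_id`, `attr_de`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- Both lists for one article, read from a single range of the primary key.
SELECT
   GROUP_CONCAT(CASE attr_group_id WHEN 26 THEN attr_de END) AS functions,
   GROUP_CONCAT(CASE attr_group_id WHEN 27 THEN attr_de END) AS miscellaneous
FROM articles_attr_flat
WHERE article_id = 11
  AND attr_group_id IN (26, 27);
```

The trade-off is that each attribute label is repeated on every article that carries it, so renaming a label means updating many rows.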


mysql

query-performance

entity-attribute-value