How do you properly use multiple group_concats on multiple subqueries within one query without distinct?

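I have a fairly horrendous query in which I'm pulling information for a specific user or set of users. I need to fetch multiple sets of normalized attributes for each user through link tables.

I want to pull these sets back as comma-separated lists, which seems to be what group_concat was built for, but I always end up with a large number of extra results, compounded with each additional join. While I think I know why, I'm having trouble working out how to structure the query to avoid it.

I've created a simplified version of what I'm trying to do in a fiddle here. In case the link ever becomes unavailable, here is the DDL: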

CREATE TABLE users (ID INT, name VARCHAR(30));

INSERT INTO users (ID, name)
VALUES (1, 'Jon');
INSERT INTO users (ID, name)
VALUES (2, 'Jane');

CREATE TABLE skills (ID INT, name VARCHAR(30), groupName VARCHAR(30));

INSERT INTO skills (ID, name, groupName)
VALUES (1, 'Drawing', 'Art');
INSERT INTO skills (ID, name, groupName)
VALUES (2, '3D Animation', 'Art');
INSERT INTO skills (ID, name, groupName)
VALUES (3, 'javaScript', 'Programming');
INSERT INTO skills (ID, name, groupName)
VALUES (4, 'HTML', 'Programming');

CREATE TABLE users2skills (UID INT, SID INT);

INSERT INTO users2skills (UID, SID)
VALUES (1, 3);
INSERT INTO users2skills (UID, SID)
VALUES (1, 4);
INSERT INTO users2skills (UID, SID)
VALUES (2, 1);
INSERT INTO users2skills (UID, SID)
VALUES (2, 2);
INSERT INTO users2skills (UID, SID)
VALUES (2, 4);

CREATE TABLE shifts (ID INT, name VARCHAR(30));

INSERT INTO shifts (ID, name)
VALUES (1, 'Daylight');
INSERT INTO shifts (ID, name)
VALUES (2, 'Evening');
INSERT INTO shifts (ID, name)
VALUES (3, 'Midnight');

CREATE TABLE users2shifts (UID INT, SID INT);
INSERT INTO users2shifts (UID, SID)
VALUES (1, 1);
INSERT INTO users2shifts (UID, SID)
VALUES (1, 2);
INSERT INTO users2shifts (UID, SID)
VALUES (2, 2);
INSERT INTO users2shifts (UID, SID)
VALUES (2, 3);

To start, here's a simple query I put together that pulls the expected results:

select  u.ID,
        u.name,
        GROUP_CONCAT(skQ.ID order by skQ.ID) as skillList,
        GROUP_CONCAT(skQ.name order by skQ.ID) as skillDesc
        
from    users u
        left outer join (
                            select  u2sk.UID, sk.ID, sk.name
                            from    users2skills u2sk
                                    inner join skills sk    on sk.ID    = u2sk.SID
                            ) skQ on skQ.UID    = u.ID
group by u.ID,
        u.name

The results look like this:

ID  name  skillList  skillDesc
1   Jon   3,4        javaScript,HTML
2   Jane  1,2,4      Drawing,3D Animation,HTML

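Note that the order of the two lists matches, so I can programmatically join the IDs to the descriptions later. However, I need additional data, including the category description for the skills, and would like a final result set like the following: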

ID  name  skillList  skillDesc                  skillGroups              shiftList  shiftDesc
1   Jon   3,4        javaScript,HTML            Programming,Programming  1,2        Daylight,Evening
2   Jane  1,2,4      Drawing,3D Animation,HTML  Art,Art,Programming      2,3        Evening,Midnight

So I added a few new group_concat statements for the new tables I wanted to join, creating this query:

select  u.ID,
        u.name,
        GROUP_CONCAT(skQ.ID order by skQ.ID) as skillList,
        GROUP_CONCAT(skQ.name order by skQ.ID) as skillDesc,
        GROUP_CONCAT(skQ.groupName order by skQ.ID) as skillGroups,
        GROUP_CONCAT(shQ.ID order by skQ.ID) as shiftList,
        GROUP_CONCAT(shQ.name order by skQ.ID) as shiftDesc
        
from    users u
        left outer join (
                            select  u2sk.UID, sk.ID, sk.name, sk.groupName
                            from    users2skills u2sk
                                    inner join skills sk    on sk.ID    = u2sk.SID
                            ) skQ on skQ.UID    = u.ID
        left outer join (
                            select  u2sh.UID, sh.ID, sh.name
                            from    users2shifts u2sh
                                    inner join shifts sh    on sh.ID    = u2sh.SID
                            ) shQ on shQ.UID    = u.ID
group by u.ID,
        u.name

However, the result set returned is:

ID  name    skillList   skillDesc                                           skillGroups                                     shiftList       shiftDesc
1   Jon     3,3,4,4     javaScript,javaScript,HTML,HTML                     Programming,Programming,Programming,Programming     2,1,2,1     Evening,Daylight,Evening,Daylight
2   Jane    1,1,2,2,4,4 Drawing,Drawing,3D Animation,3D Animation,HTML,HTML Art,Art,Art,Art,Programming,Programming         3,2,3,2,3,2     Midnight,Evening,Midnight,Evening,Midnight,Evening

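In my actual query this problem is multiplied, with each list containing 100+ items. I've seen many questions about how to solve this, but they generally get the same answer: use DISTINCT. That creates a problem for me, because I have duplicate values in the "groupName" field that DISTINCT filters out. This produces lists of differing sizes and prevents me from doing anything with the data once I get it back.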
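To illustrate why mismatched list sizes are a problem, here is a minimal Python sketch using Jane's values from the desired result above. The variable names are hypothetical, and `dict.fromkeys` only mimics what DISTINCT would do to the list; it is not what MySQL runs.

```python
# Jane's lists as they should come back (same length, same order).
skill_list = "1,2,4".split(",")
skill_groups = "Art,Art,Programming".split(",")

# With equal-length lists, IDs pair up with their groups by position.
paired = dict(zip(skill_list, skill_groups))

# Mimic DISTINCT: duplicates are dropped, so the list gets shorter.
distinct_groups = list(dict.fromkeys(skill_groups))

print(paired)
print(len(skill_list), len(distinct_groups))
```

Once the group list is shorter than the ID list, there is no way to tell by position which ID belongs to which group.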

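If I can help it, I'd rather not make 10 separate queries, with all the associated overhead and connections between the web and database servers. I suspect a single query would be much faster anyway. What is the proper way to get the data I need in the format I need?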

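For a given row in the main table, there are multiple matches in each left joined table, so the rows multiply and you end up with the same values being aggregated more than once. For example, Jon has 2 skills and 2 shifts, so the two joins produce 2 x 2 = 4 rows for him, and every list ends up with 4 entries.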
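The multiplication can be observed directly by counting the joined rows per user before any concatenation happens. This is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for MySQL; it loads only the users and the two link tables from the DDL in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (ID INT, name VARCHAR(30));
    INSERT INTO users VALUES (1, 'Jon'), (2, 'Jane');
    CREATE TABLE users2skills (UID INT, SID INT);
    INSERT INTO users2skills VALUES (1, 3), (1, 4), (2, 1), (2, 2), (2, 4);
    CREATE TABLE users2shifts (UID INT, SID INT);
    INSERT INTO users2shifts VALUES (1, 1), (1, 2), (2, 2), (2, 3);
""")

# Each skill row is paired with each shift row of the same user,
# so the row count per user is (number of skills) * (number of shifts).
rows = conn.execute("""
    SELECT u.name, COUNT(*) AS joinedRows
    FROM users u
    LEFT JOIN users2skills u2sk ON u2sk.UID = u.ID
    LEFT JOIN users2shifts u2sh ON u2sh.UID = u.ID
    GROUP BY u.ID, u.name
    ORDER BY u.ID
""").fetchall()

joined_row_counts = dict(rows)
print(joined_row_counts)  # {'Jon': 4, 'Jane': 6}
```

These counts match the number of entries in each list of the unwanted result set in the question.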

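Basically, you need to aggregate in the subqueries, so that each one returns a single row per user before it is joined: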

select u.ID, u.name, skQ.skillList, skQ.skillDesc, skQ.skillGroups, shQ.shiftList, shQ.shiftDesc
from users u
left outer join (
    -- one row per user: all skill lists share the same ordering
    select u2sk.UID, 
        GROUP_CONCAT(sk.ID        order by sk.ID) as skillList,
        GROUP_CONCAT(sk.name      order by sk.ID) as skillDesc,
        GROUP_CONCAT(sk.groupName order by sk.ID) as skillGroups
    from users2skills u2sk
    inner join skills sk on sk.ID = u2sk.SID
    group by u2sk.UID
) skQ on skQ.UID  = u.ID
left outer join (
    -- one row per user: both shift lists share the same ordering
    select u2sh.UID, 
        GROUP_CONCAT(sh.ID   order by sh.ID) as shiftList,
        GROUP_CONCAT(sh.name order by sh.ID) as shiftDesc
    from users2shifts u2sh
    inner join shifts sh on sh.ID = u2sh.SID
    group by u2sh.UID
) shQ on shQ.UID = u.ID
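
As a rough check of this approach, the sketch below runs a trimmed version of the query against the question's data. It assumes Python's built-in sqlite3 module in place of MySQL, and it leaves out the `order by` inside GROUP_CONCAT because older SQLite versions do not accept it, so it only verifies list sizes, not element order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (ID INT, name VARCHAR(30));
    INSERT INTO users VALUES (1, 'Jon'), (2, 'Jane');
    CREATE TABLE skills (ID INT, name VARCHAR(30), groupName VARCHAR(30));
    INSERT INTO skills VALUES (1, 'Drawing', 'Art'), (2, '3D Animation', 'Art'),
                              (3, 'javaScript', 'Programming'), (4, 'HTML', 'Programming');
    CREATE TABLE users2skills (UID INT, SID INT);
    INSERT INTO users2skills VALUES (1, 3), (1, 4), (2, 1), (2, 2), (2, 4);
    CREATE TABLE shifts (ID INT, name VARCHAR(30));
    INSERT INTO shifts VALUES (1, 'Daylight'), (2, 'Evening'), (3, 'Midnight');
    CREATE TABLE users2shifts (UID INT, SID INT);
    INSERT INTO users2shifts VALUES (1, 1), (1, 2), (2, 2), (2, 3);
""")

# Each derived table is grouped by user first, so the outer joins
# match at most one row per user and nothing is multiplied.
query = """
    SELECT u.name, skQ.skillGroups, shQ.shiftDesc
    FROM users u
    LEFT JOIN (
        SELECT u2sk.UID, GROUP_CONCAT(sk.groupName) AS skillGroups
        FROM users2skills u2sk
        INNER JOIN skills sk ON sk.ID = u2sk.SID
        GROUP BY u2sk.UID
    ) skQ ON skQ.UID = u.ID
    LEFT JOIN (
        SELECT u2sh.UID, GROUP_CONCAT(sh.name) AS shiftDesc
        FROM users2shifts u2sh
        INNER JOIN shifts sh ON sh.ID = u2sh.SID
        GROUP BY u2sh.UID
    ) shQ ON shQ.UID = u.ID
    ORDER BY u.ID
"""

# Count the comma-separated entries in each list per user.
list_sizes = {
    name: (len(groups.split(",")), len(shifts.split(",")))
    for name, groups, shifts in conn.execute(query)
}
print(list_sizes)  # {'Jon': (2, 2), 'Jane': (3, 2)}
```

The duplicate group names survive (Jon keeps two "Programming" entries) because no DISTINCT is involved, and each list has exactly one entry per skill or shift.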