group_concat with left join is way too slow

I have a song library and a working query that joins multiple tables and returns all of the song_id, song_title, etc. values as one big table, like this:

song_id             song_title          song_vocal          song_composer
--------------------------------------------------------------------------
1                   Hello Boy           John                Mike
2                   Hello Girl          Jim                 Dave
2                   Hello Girl          Tom                 Dave

I want to merge songs that have multiple vocals/composers into a single row, like this:

song_id             song_title          song_vocal          song_composer
--------------------------------------------------------------------------
1                   Hello Boy           John                Mike
2                   Hello Girl          Jim,Tom             Dave

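I got this working with the group_concat() function, but it is far too slow. Without group_concat, my query takes only 0.06 seconds. With it, the query takes 3 seconds. Here is my query: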

SELECT
songs.song_id,
songs.title as Song_Title,
group_concat(DISTINCT a1.artist_name ORDER BY a1.artist_name DESC SEPARATOR ',' ) as Vocals,
group_concat(DISTINCT a2.artist_name ORDER BY a2.artist_name DESC SEPARATOR ',' ) as Composers    

FROM songs

left join song_vocals ON song_vocals.song_id = songs.song_id
left join artists a1 ON song_vocals.artist_id = a1.artist_id

left join song_composers on songs.song_id = song_composers.song_id
left join artists a2  on a2.artist_id = song_composers.artist_id

GROUP BY songs.song_id

What am I doing wrong?

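Yes, of course this is too slow. For each song you are generating a Cartesian product of its vocals and its composers, and then removing the duplicates.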
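To see how large the intermediate result gets, a diagnostic along these lines may help. It is only a sketch that reuses the table and column names from the question; the aliases joined_rows, vocal_count, and composer_count are made up for illustration.

```sql
-- Count the rows the two LEFT JOINs produce per song before GROUP BY collapses them.
-- joined_rows, vocal_count and composer_count are illustrative aliases, not part of the schema.
SELECT s.song_id,
       COUNT(*)                     AS joined_rows,    -- rows after both joins
       COUNT(DISTINCT sv.artist_id) AS vocal_count,    -- distinct vocalists for the song
       COUNT(DISTINCT sc.artist_id) AS composer_count  -- distinct composers for the song
FROM songs s
LEFT JOIN song_vocals sv    ON sv.song_id = s.song_id
LEFT JOIN song_composers sc ON sc.song_id = s.song_id
GROUP BY s.song_id
ORDER BY joined_rows DESC
LIMIT 10;
```

A song with 3 vocalists and 4 composers shows up with 3 x 4 = 12 joined rows, and group_concat(DISTINCT ...) has to de-duplicate all of them twice, once per output column.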

I would suggest replacing the logic with correlated subqueries:

select s.song_id, s.title as Song_Title,
       (select group_concat(a.artist_name order by a.artist_name DESC separator ',' ) 
        from song_vocals sv join 
             artists a 
             on sv.artist_id = a.artist_id
        where sv.song_id = s.song_id
       ) as Vocals,
       (select group_concat(a.artist_name order by a.artist_name desc separator ',' ) 
        from song_composers sc join 
             artists a 
             on sc.artist_id = a.artist_id
        where sc.song_id = s.song_id
       ) as Composers    
from songs s;

This can also take advantage of indexes on song_composers(song_id, artist_id) and song_vocals(song_id, artist_id). I also removed the distinct from group_concat(). It should no longer be needed.
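If those indexes do not exist yet, they could be created roughly as follows. This is a sketch assuming MySQL; the index names are hypothetical, and if either table already has a primary key on (song_id, artist_id), that key already serves the same purpose and the extra index is redundant.

```sql
-- Check what is already indexed before adding anything.
SHOW INDEX FROM song_vocals;
SHOW INDEX FROM song_composers;

-- Composite indexes led by song_id, so each correlated subquery can look up
-- one song's rows directly. The index names are made up for this example.
CREATE INDEX idx_song_vocals_song_artist
    ON song_vocals (song_id, artist_id);

CREATE INDEX idx_song_composers_song_artist
    ON song_composers (song_id, artist_id);
```

Running EXPLAIN on the rewritten query afterwards should show whether the subqueries actually use these indexes.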

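When dealing with multiple many-to-many relationships, you almost always want to (sub)query each relationship separately and then combine the results.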

SELECT s.song_id, s.title as Song_Title, v.Vocals, c.Composers    
FROM songs AS s
LEFT JOIN (
   SELECT sv.song_id
        , GROUP_CONCAT(DISTINCT a.artist_name 
                       ORDER BY a.artist_name DESC 
                       SEPARATOR ',' 
          ) as Vocals
   FROM song_vocals AS sv
   LEFT JOIN artists AS a ON sv.artist_id = a.artist_id
   GROUP BY sv.song_id
) AS v ON s.song_id = v.song_id
LEFT JOIN (
   SELECT sc.song_id
        , GROUP_CONCAT(DISTINCT a.artist_name 
                       ORDER BY a.artist_name DESC 
                       SEPARATOR ',' 
          ) as Composers
   FROM song_composers AS sc
   LEFT JOIN artists AS a ON sc.artist_id = a.artist_id
   GROUP BY sc.song_id
) AS c ON s.song_id = c.song_id
;