How to prevent dependent subqueries within CASE WHEN x THEN (subquery)

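I have a very complex query that uses some subqueries inside a CASE statement.

So this post works with pseudocode. I can post the real query if needed, but it is a monster and would not help with this question.

What I want are cacheable subqueries inside the CASE statement.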

SELECT * FROM posts posts
INNER JOIN posts_shared_to shared_to
      ON shared_to.post_id = posts.post_id
INNER JOIN channels channels
      ON channels.channel_id = shared_to.channel_id
WHERE posts.parent_id IS NULL
AND MATCH (posts.text) AGAINST (:keyword IN BOOLEAN MODE)
AND CASE
    WHEN channels.read_access IS NULL THEN 1
    WHEN channels.read_access = 1 THEN
    (
      SELECT count(*) FROM channel_users
      WHERE user_id = XXX AND channel_id = channels.channel_id
    )
    WHEN shared_to.read_type = 2 THEN
    (
      /* another subquery with a join */
      /* check if user is in friendlist of post_author */
    )
    ELSE 0
    END
GROUP BY posts.post_id
ORDER BY posts.post_id


MySQL's EXPLAIN reports every subquery used inside the CASE as DEPENDENT, which means (if I am correct) that they have to run for each row and are not cached.


Edited part: the real query now looks like this:

SELECT a.id, a.title, a.message AS post_text, a.type, a.date, a.author AS uid, 
b.a_name as name, b.avatar, 
shared_to.to_circle AS circle_id, shared_to.root_circle,
c.circle_name, c.read_access, c.owner_uid, c.profile,
MATCH(a.title,a.message) AGAINST (:keyword IN BOOLEAN MODE) AS score

FROM posts a 

/** get userdetails for post_author **/
JOIN authors b ON b.id = a.author

/** get circles posts was shared to **/
JOIN posts_shared_to shared_to ON shared_to.post_id = a.id AND shared_to.deleted IS NULL

/**
 * get circle_details note: at the moment shared_to can contain NULL and 1 too and doesn't need to be a circle_id
 * if to_circle IS NULL post was shared public
 * if to_circle = 1 post was shared to private circles
 * since we use md5 keys as circle ids this can be a string instead of (int) ... ugly..
 **/
LEFT JOIN circles c ON c.circle_id = shared_to.to_circle 
    /*AND c.circle_name IS NOT NULL */
    AND ( c.profile IS NULL OR c.profile = 6 OR c.profile = 1 ) 
    AND c.deleted IS NULL

LEFT JOIN (
    /** if post is within a channel that requires membership we use this to check if requesting user is member **/
    SELECT COUNT(*) users_count, user_id, circle_id FROM circles_users
    GROUP BY user_id, circle_id
    ) counts ON counts.circle_id = shared_to.to_circle
             AND counts.user_id = :me

LEFT JOIN (
    /** if post is shared private we check if requesting users exists within post authors private circles **/
    SELECT count(*) in_circles_count, ci.owner_uid AS circle_owner, cu1.user_id AS user_me 
    FROM circles ci 
    INNER JOIN circles_users cu1 ON cu1.circle_id = ci.circle_id 
                                 AND cu1.deleted IS NULL 
    WHERE ci.profile IS NULL AND ci.deleted IS NULL
    GROUP BY user_me, circle_owner
) users_in_circles ON users_in_circles.user_me = :me 
                   AND users_in_circles.circle_owner = a.id

/** make sure post is a topic **/
WHERE a.parent_id IS NULL AND a.deleted IS NULL

/** search title and post body **/
AND MATCH (a.title,a.message) AGAINST (:keyword IN BOOLEAN MODE) 

AND (
    /** own circle **/
    c.owner_uid = :me
    /** site member read_access ( this query is for members, for guests we use a different query ) **/
    OR ( c.read_access = 1 OR c.read_access = "1" )
    /** public read_access **/
    OR ( shared_to.to_circle IS NULL OR ( c.read_access IS NULL AND c.owner_uid IS NOT NULL ) )
    /** channel/circle member read_access**/
    OR ( c.read_access = 3 OR c.read_access = "3" AND counts.users_count > 0 )
    /** for users within post creators private circles **/
    OR ( 
    /** use shared_to to determine if post is private **/
    shared_to.to_circle = "1" OR shared_to.to_circle = 1 
    /** use circle settings to determine global privacy **/
    OR ( c.owner_uid IS NOT NULL AND c.read_access = 2 OR c.read_access = "2" )
    ) AND users_in_circles.circle_owner = a.author AND users_in_circles.user_me = :me
)


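Question: is this really the better way? Looking at how many rows the derived tables can contain, I am not sure.

Maybe someone can help me change the query the way @Ollie-Jones mentioned: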

SELECT stuff, stuff, stuff
  FROM (
         SELECT post.post_id
           FROM your whole query
          ORDER BY post_id DESC
          LIMIT n,n
       ) ids
  JOIN whatever ON whatever.post_id = ids.post_id
  JOIN whatelse ON whatelse

Sorry if this sounds lazy, but I'm not really a MySQL guy, and building this query has been giving me headaches for years. :D

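The best way to eliminate dependent subqueries is to refactor them into virtual tables (independent subqueries), and then JOIN or LEFT JOIN them to the rest of your tables.

In your case, you have this dependent subquery:

     SELECT count(*) FROM channel_users 
      WHERE user_id = XXX AND channel_id = channels.channel_id

It can be recast as a virtual table like this:

                   SELECT COUNT(*) users_count,
                          user_id, channel_id
                    FROM channel_users
                   GROUP BY user_id, channel_id

Do you see how the virtual table contains one row for each distinct combination of user_id and channel_id values? Each row carries the users_count value you need. You can then JOIN it into the rest of your query like this. (Note that INNER JOIN === JOIN in MySQL, so I used JOIN to shorten things a little.)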

SELECT * FROM posts posts
  JOIN posts_shared_to shared_to ON shared_to.post_id = posts.post_id
  JOIN channels channels ON channels.channel_id = shared_to.channel_id
  LEFT JOIN (
                   SELECT COUNT(*) users_count,
                          user_id, channel_id
                    FROM channel_users
                   GROUP BY user_id, channel_id
       ) counts ON counts.channel_id = shared_to.channel_id
               AND counts.user_id = channels.user_id
  LEFT JOIN (  /* your other refactored subquery */
            ) friendcounts ON whatever
 WHERE posts.parent_id IS NULL
   AND channels.user_id = XXX
   AND MATCH (posts.text) AGAINST (:keyword IN BOOLEAN MODE) 
   AND (          channels.read_access IS NULL
               OR (channels.read_access = 1 AND counts.users_count > 0)
               OR (shared_to.read_type = 2 AND friendcounts.users_count > 0)
       )
 GROUP BY posts.post_id
 ORDER BY posts.post_id DESC
 LIMIT n,n

The MySQL query planner is usually smart enough to generate the appropriate subset of each independent subquery.
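To check that the refactored form returns the same rows as the CASE form, a minimal sketch like the following may help. It runs on Python's built-in sqlite3 module rather than MySQL, purely so it can be tried anywhere, so it only compares results and says nothing about MySQL's execution plan. The sample rows, the user ids 7 and 8, and the helper `visible_posts` are invented for illustration, and the friend-list branch and the full-text MATCH are left out.

```python
import sqlite3

# Hypothetical miniature schema: table names follow the pseudocode, data is invented.
SCHEMA = """
CREATE TABLE posts (post_id INTEGER PRIMARY KEY, parent_id INTEGER, body TEXT);
CREATE TABLE channels (channel_id INTEGER PRIMARY KEY, read_access INTEGER);
CREATE TABLE posts_shared_to (post_id INTEGER, channel_id INTEGER);
CREATE TABLE channel_users (user_id INTEGER, channel_id INTEGER);
INSERT INTO posts VALUES (1, NULL, 'a'), (2, NULL, 'b'), (3, NULL, 'c'), (4, 1, 'reply');
INSERT INTO channels VALUES (10, NULL), (20, 1), (30, 1);
INSERT INTO posts_shared_to VALUES (1, 10), (2, 20), (3, 30), (4, 10);
INSERT INTO channel_users VALUES (7, 20), (8, 30);
"""

# Original shape: a correlated (dependent) subquery inside CASE.
DEPENDENT = """
SELECT posts.post_id
  FROM posts
  JOIN posts_shared_to shared_to ON shared_to.post_id = posts.post_id
  JOIN channels ON channels.channel_id = shared_to.channel_id
 WHERE posts.parent_id IS NULL
   AND CASE
         WHEN channels.read_access IS NULL THEN 1
         WHEN channels.read_access = 1 THEN (
              SELECT COUNT(*) FROM channel_users
               WHERE user_id = :me AND channel_id = channels.channel_id)
         ELSE 0
       END
 ORDER BY posts.post_id
"""

# Refactored shape: the count becomes an independent derived table that is LEFT JOINed.
DERIVED = """
SELECT posts.post_id
  FROM posts
  JOIN posts_shared_to shared_to ON shared_to.post_id = posts.post_id
  JOIN channels ON channels.channel_id = shared_to.channel_id
  LEFT JOIN (
        SELECT COUNT(*) users_count, user_id, channel_id
          FROM channel_users
         GROUP BY user_id, channel_id
       ) counts ON counts.channel_id = shared_to.channel_id
               AND counts.user_id = :me
 WHERE posts.parent_id IS NULL
   AND (   channels.read_access IS NULL
        OR (channels.read_access = 1 AND counts.users_count > 0))
 ORDER BY posts.post_id
"""

def visible_posts(sql, me):
    """Run one of the two query shapes against a fresh in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    rows = [row[0] for row in conn.execute(sql, {"me": me})]
    conn.close()
    return rows
```

With this sample data, `visible_posts(DEPENDENT, 7)` and `visible_posts(DERIVED, 7)` both return `[1, 2]`: post 1 is public, post 2 sits in a channel that user 7 belongs to, post 3 is in a channel the user is not a member of, and post 4 is a reply. On MySQL itself, running EXPLAIN on the rewritten query is still the way to confirm that the DEPENDENT SUBQUERY entries are gone.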

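Pro tip: SELECT lots of columns ... ORDER BY something LIMIT n is generally considered a wasteful antipattern. It kills performance because it sorts a whole pile of columns of data and then discards most of the result.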

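Pro tip: SELECT * in a JOIN query is also wasteful. You are much better off listing the columns you actually need in your result set.

So you could refactor your query again, along these lines: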


    SELECT stuff, stuff, stuff
      FROM (
             SELECT post.post_id
               FROM your whole query
              ORDER BY post_id DESC
              LIMIT n,n
           ) ids
      JOIN whatever ON whatever.post_id = ids.post_id
      JOIN whatelse ON whatelse.

The idea is to sort only the post_id values, then use the LIMITed subset to pull in the rest of the data you need.
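A small runnable sketch of that pattern, again using Python's sqlite3 module as a stand-in for MySQL. The `posts` and `post_authors` tables, their columns, the 100 generated rows, and the `page` helper are all assumptions made up for the example. The outer ORDER BY is repeated because the order of rows coming out of the join is not guaranteed to follow the order of the inner subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (post_id INTEGER PRIMARY KEY, title TEXT, message TEXT);
CREATE TABLE post_authors (post_id INTEGER PRIMARY KEY, name TEXT);
""")
# Invented sample data: 100 posts, one author row per post.
conn.executemany("INSERT INTO posts VALUES (?, ?, ?)",
                 [(i, f"title {i}", f"message {i}") for i in range(1, 101)])
conn.executemany("INSERT INTO post_authors VALUES (?, ?)",
                 [(i, f"author {i}") for i in range(1, 101)])

# Inner query: sort and LIMIT only the narrow post_id column.
# Outer query: fetch the wide columns just for the ids that survived.
PAGE = """
SELECT posts.post_id, posts.title, post_authors.name
  FROM (
         SELECT post_id
           FROM posts
          ORDER BY post_id DESC
          LIMIT :offset, :count
       ) ids
  JOIN posts ON posts.post_id = ids.post_id
  JOIN post_authors ON post_authors.post_id = ids.post_id
 ORDER BY posts.post_id DESC
"""

def page(offset, count):
    """Return one page of (post_id, title, author name) rows, newest first."""
    return conn.execute(PAGE, {"offset": offset, "count": count}).fetchall()
```

For example, `page(10, 3)` skips the ten newest posts and returns the rows for post ids 90, 89 and 88. In the real query the inner SELECT would carry the full set of joins and access checks, while the outer joins only touch the handful of ids on the requested page.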