How to prevent dependent subqueries within CASE WHEN x THEN (subquery)
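I have a very complex query that uses some subqueries inside a CASE statement.
The full query isn't needed for this question; it would only keep people from getting to the problem quickly.
So this post uses pseudocode instead. I can post the real query if needed, but it's a monster and adds nothing to the question.
What I want is cacheable subqueries inside the CASE statement.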
SELECT * FROM posts
INNER JOIN posts_shared_to shared_to
    ON shared_to.post_id = posts.post_id
INNER JOIN channels
    ON channels.channel_id = shared_to.channel_id
WHERE posts.parent_id IS NULL
AND MATCH (posts.text) AGAINST (:keyword IN BOOLEAN MODE)
AND CASE
    WHEN channels.read_access IS NULL THEN 1
    WHEN channels.read_access = 1 THEN
    (
        SELECT count(*) FROM channel_users
        WHERE user_id = XXX AND channel_id = channels.channel_id
    )
    WHEN shared_to.read_type = 2 THEN
    (
        /* another subquery with a join */
        /* check if user is in friendlist of post_author */
    )
    ELSE 0
END
GROUP BY posts.post_id
ORDER BY posts.post_id
DESC LIMIT n,n
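As mentioned above, this is only simplified pseudocode.
MySQL EXPLAIN reports every subquery used in the CASE as DEPENDENT, which means (if I'm right) that they have to run for every row and are not cached.
Any solution that helps speed up this query is welcome.
Edit:
The real query now looks like this: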
SELECT a.id, a.title, a.message AS post_text, a.type, a.date, a.author AS uid,
b.a_name as name, b.avatar,
shared_to.to_circle AS circle_id, shared_to.root_circle,
c.circle_name, c.read_access, c.owner_uid, c.profile,
MATCH(a.title,a.message) AGAINST (:keyword IN BOOLEAN MODE) AS score
FROM posts a
/** get userdetails for post_author **/
JOIN authors b ON b.id = a.author
/** get circles posts was shared to **/
JOIN posts_shared_to shared_to ON shared_to.post_id = a.id AND shared_to.deleted IS NULL
/**
* get circle_details note: at the moment shared_to can contain NULL and 1 too and doesn't need to be a circle_id
* if to_circle IS NULL post was shared public
* if to_circle = 1 post was shared to private circles
* since we use md5 keys as circle ids this can be a string instead of (int) ... ugly..
*
**/
LEFT JOIN circles c ON c.circle_id = shared_to.to_circle
/*AND c.circle_name IS NOT NULL */
AND ( c.profile IS NULL OR c.profile = 6 OR c.profile = 1 )
AND c.deleted IS NULL
LEFT JOIN (
/** if post is within a channel that requires membership we use this to check if requesting user is member **/
SELECT COUNT(*) users_count, user_id, circle_id FROM circles_users
GROUP BY user_id, circle_id
) counts ON counts.circle_id = shared_to.to_circle
AND counts.user_id = :me
LEFT JOIN (
/** if post is shared private we check if requesting users exists within post authors private circles **/
SELECT count(*) in_circles_count, ci.owner_uid AS circle_owner, cu1.user_id AS user_me
FROM circles ci
INNER JOIN circles_users cu1 ON cu1.circle_id = ci.circle_id
AND cu1.deleted IS NULL
WHERE ci.profile IS NULL AND ci.deleted IS NULL
GROUP BY user_me, circle_owner
) users_in_circles ON users_in_circles.user_me = :me
AND users_in_circles.circle_owner = a.author
/** make sure post is a topic **/
WHERE a.parent_id IS NULL AND a.deleted IS NULL
/** search title and post body **/
AND MATCH (a.title,a.message) AGAINST (:keyword IN BOOLEAN MODE)
AND (
/** own circle **/
c.owner_uid = :me
/** site member read_access ( this query is for members, for guests we use a different query ) **/
OR ( c.read_access = 1 OR c.read_access = "1" )
/** public read_access **/
OR ( shared_to.to_circle IS NULL OR ( c.read_access IS NULL AND c.owner_uid IS NOT NULL ) )
/** channel/circle member read_access**/
OR ( ( c.read_access = 3 OR c.read_access = "3" ) AND counts.users_count > 0 )
/** for users within post creators private circles **/
OR (
(
/** use shared_to to determine if post is private **/
shared_to.to_circle = "1" OR shared_to.to_circle = 1
/** use circle settings to determine global privacy **/
OR ( c.owner_uid IS NOT NULL AND ( c.read_access = 2 OR c.read_access = "2" ) )
) AND users_in_circles.circle_owner = a.author AND users_in_circles.user_me = :me
)
)
GROUP BY a.id ORDER BY a.id DESC LIMIT n,n
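Question:
Is this really the better approach? Looking at how many rows the derived tables can contain, I'm not sure.
Maybe someone can help me change the query the way @Ollie-Jones suggested: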
SELECT stuff, stuff, stuff
FROM (
SELECT post.post_id
FROM your whole query
ORDER BY post_id DESC
LIMIT n,n
) ids
JOIN whatever ON whatever.post_id = ids.post_id
JOIN whatelse ON whatelse
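Sorry if this sounds lazy, but I'm not really a MySQL guy, and building this query has been giving me headaches for years. :D
The best way to eliminate a dependent subquery is to refactor it into a virtual table (an independent subquery), and then JOIN or LEFT JOIN it to the rest of your tables.
In your case, you have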
SELECT count(*) FROM channel_users
WHERE user_id = XXX AND channel_id = channels.channel_id
So the independent-subquery version of that is
SELECT COUNT(*) users_count,
user_id, channel_id
FROM channel_users
GROUP BY user_id, channel_id
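Do you see how this virtual table contains one row for each distinct combination of user_id and channel_id values? Each row has the users_count value you need. You can then JOIN it into the rest of your query, like this. (Note that INNER JOIN and JOIN mean the same thing in MySQL, so I used JOIN to shorten things a bit.)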
SELECT * FROM posts
JOIN posts_shared_to shared_to ON shared_to.post_id = posts.post_id
JOIN channels ON channels.channel_id = shared_to.channel_id
LEFT JOIN (
    SELECT COUNT(*) users_count,
           user_id, channel_id
    FROM channel_users
    GROUP BY user_id, channel_id
) counts ON counts.channel_id = shared_to.channel_id
        AND counts.user_id = XXX
LEFT JOIN ( /* your other refactored subquery */
) friendcounts ON whatever
WHERE posts.parent_id IS NULL
AND MATCH (posts.text) AGAINST (:keyword IN BOOLEAN MODE)
AND ( channels.read_access IS NULL
   OR (channels.read_access = 1 AND counts.users_count > 0)
   OR (shared_to.read_type = 2 AND friendcounts.users_count > 0)
)
GROUP BY posts.post_id
ORDER BY posts.post_id DESC
LIMIT n,n
The MySQL query planner is usually smart enough to generate only the appropriate subset of each independent subquery.
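If the size of the derived table is a concern (grouped by user and channel, it can hold one row per membership), one option is to filter on the requesting user inside the derived table, so it only ever contains that user's channels. The following is a minimal sketch, not a drop-in replacement: the table definitions are hypothetical and reduced to the columns used here, `@me` stands in for the XXX placeholder, the full-text condition is left out for brevity, and the page size of 20 is arbitrary. How the planner handles the derived table depends on your MySQL version, so it is worth checking with EXPLAIN.

```sql
-- Hypothetical minimal schema: only the columns this sketch needs.
CREATE TABLE channels (
    channel_id  INT PRIMARY KEY,
    read_access INT NULL
);
CREATE TABLE channel_users (
    user_id    INT NOT NULL,
    channel_id INT NOT NULL,
    PRIMARY KEY (user_id, channel_id)
);
CREATE TABLE posts (
    post_id   INT PRIMARY KEY,
    parent_id INT NULL
);
CREATE TABLE posts_shared_to (
    post_id    INT NOT NULL,
    channel_id INT NOT NULL
);

SET @me = 42;  -- assumption: id of the requesting user (XXX in the question)

SELECT posts.post_id
  FROM posts
  JOIN posts_shared_to shared_to ON shared_to.post_id = posts.post_id
  JOIN channels ON channels.channel_id = shared_to.channel_id
  LEFT JOIN (
        -- Independent of the outer rows: it only refers to @me,
        -- so it is limited to the requesting user's memberships.
        SELECT channel_id, COUNT(*) AS users_count
          FROM channel_users
         WHERE user_id = @me
         GROUP BY channel_id
       ) counts ON counts.channel_id = shared_to.channel_id
 WHERE posts.parent_id IS NULL
   AND (   channels.read_access IS NULL
        OR (channels.read_access = 1 AND counts.users_count > 0))
 GROUP BY posts.post_id
 ORDER BY posts.post_id DESC
 LIMIT 0, 20;
```

An index that starts with user_id on channel_users (here the primary key) lets the inner query read only that user's rows.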
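Pro tip: SELECT lots of columns ... ORDER BY something LIMIT n is generally considered a wasteful antipattern. It hurts performance because it sorts a whole mess of columns of data and then discards most of the result.
Pro tip: SELECT * in a JOIN query is also wasteful. You are much better off listing the columns you actually need in your result set.
So, you could refactor your query again to do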
SELECT stuff, stuff, stuff
FROM (
SELECT post.post_id
FROM your whole query
ORDER BY post_id DESC
LIMIT n,n
) ids
JOIN whatever ON whatever.post_id = ids.post_id
JOIN whatelse ON whatelse.
The idea is to sort only the post_id values, then use the LIMITed subset to pull in the rest of the data you need.
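To make that skeleton a bit more concrete, here is a sketch using the posts and authors tables from the real query in the question. The column types are guesses, the access-control joins and the full-text condition are left out so the shape stays visible, and the page size of 20 is arbitrary; treat it as an outline under those assumptions rather than a finished query.

```sql
-- Hypothetical minimal schema; column types are assumptions.
CREATE TABLE authors (
    id     INT PRIMARY KEY,
    a_name VARCHAR(100),
    avatar VARCHAR(255)
);
CREATE TABLE posts (
    id        INT PRIMARY KEY,
    parent_id INT NULL,
    deleted   DATETIME NULL,
    author    INT NOT NULL,
    title     VARCHAR(255),
    message   TEXT
);

SELECT a.id, a.title, a.message AS post_text,
       b.a_name AS name, b.avatar
  FROM (
        -- Step 1: filter, sort and limit on the narrow id column only.
        -- The access checks from the full query would go in here too.
        SELECT p.id
          FROM posts p
         WHERE p.parent_id IS NULL
           AND p.deleted IS NULL
         ORDER BY p.id DESC
         LIMIT 0, 20
       ) ids
  -- Step 2: fetch the wide columns for just that page of ids.
  JOIN posts a   ON a.id = ids.id
  JOIN authors b ON b.id = a.author
 ORDER BY a.id DESC;
```

The outer ORDER BY is repeated on purpose: the join does not guarantee that rows come back in the order the inner query produced them.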