How to eliminate duplicate of subquery in ".. where X in (S) or Y in (S)"?

Question

I have a query where I need to get rows from a table where either of two foreign keys exists in another query. Here is the simplified SQL:

Select MainID From MainTable Where
Key1 In (Select SubID From SubTable Where UserID=@UserID) Or
Key2 In (Select SubID From SubTable Where UserID=@UserID)

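As you can see, the sub-query is duplicated. Is the SQL compiler intelligent enough to recognize this and run the sub-query once only, or does it run twice?

Is there a better way I can write this SQL?

Update: I should have mentioned this originally - SubID is the primary key on SubTable.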

Answer 1

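I don't think the compiler is intelligent enough to do a single table scan or index seek.

If you have a complex where clause, you can push the sub-query result into a temp table. Using the temp table in the where clause should then perform better.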

SELECT SubID
INTO   #SubTable
FROM   SubTable
WHERE  UserID = @UserID

SELECT MainID
FROM   MainTable M
WHERE  EXISTS (SELECT 1
               FROM   #SubTable S
               WHERE  M.Key1 = S.SubID)
        OR EXISTS (SELECT 1
                   FROM   #SubTable S
                   WHERE  M.Key2 = S.SubID)

Answer 2

You can use a common table expression:

with subid_data as (
  Select SubID 
  From SubTable 
  Where UserID=@UserID
)
Select MainID 
From MainTable 
Where Key1 In (select SubID from subid_data)  
   Or Key2 In (select SubID from subid_data);
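As a rough sanity check that the temp-table and CTE rewrites return the same rows as the original query, here is a minimal sketch using Python's built-in sqlite3 module. SQLite only stands in for SQL Server here, so it says nothing about SQL Server's query plans or performance; the sample rows and the TmpSub table name are made up for illustration, and `:UserID` replaces `@UserID` because that is the placeholder style Python's sqlite3 documents.

```python
import sqlite3

# In-memory stand-in for the question's schema; all rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SubTable (SubID INTEGER PRIMARY KEY, UserID INTEGER);
    CREATE TABLE MainTable (MainID INTEGER PRIMARY KEY, Key1 INTEGER, Key2 INTEGER);
    INSERT INTO SubTable VALUES (1, 10), (2, 10), (3, 20);
    INSERT INTO MainTable VALUES
        (100, 1, 3), (101, 3, 2), (102, 3, 3), (103, 1, 2), (104, NULL, 1);
""")
user = {"UserID": 10}

def ids(sql, params=()):
    """Run a query and return its first column as a list."""
    return [row[0] for row in conn.execute(sql, params)]

# The query from the question, with the sub-query written twice.
original_ids = ids("""
    SELECT MainID FROM MainTable
    WHERE Key1 IN (SELECT SubID FROM SubTable WHERE UserID = :UserID)
       OR Key2 IN (SELECT SubID FROM SubTable WHERE UserID = :UserID)
    ORDER BY MainID
""", user)

# CTE variant: the sub-query is named once and referenced twice.
cte_ids = ids("""
    WITH subid_data AS (SELECT SubID FROM SubTable WHERE UserID = :UserID)
    SELECT MainID FROM MainTable
    WHERE Key1 IN (SELECT SubID FROM subid_data)
       OR Key2 IN (SELECT SubID FROM subid_data)
    ORDER BY MainID
""", user)

# Temp-table variant: materialize the sub-query once (TmpSub is a hypothetical name).
conn.execute("CREATE TEMP TABLE TmpSub (SubID INTEGER PRIMARY KEY)")
conn.execute("INSERT INTO TmpSub SELECT SubID FROM SubTable WHERE UserID = :UserID", user)
temp_ids = ids("""
    SELECT MainID FROM MainTable AS m
    WHERE EXISTS (SELECT 1 FROM TmpSub AS s WHERE m.Key1 = s.SubID)
       OR EXISTS (SELECT 1 FROM TmpSub AS s WHERE m.Key2 = s.SubID)
    ORDER BY MainID
""")

print(original_ids, cte_ids, temp_ids)  # each list is [100, 101, 103, 104]
```

Matching results only show the rewrites are interchangeable on this data; whether either one is faster on SQL Server is something the actual query plan has to answer.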

Answer 3

You can replace the IN clauses with an EXISTS clause:

Select MainID From MainTable 
Where Exists
(
  Select * 
  From SubTable 
  Where UserID = @UserID 
  And SubID in (MainTable.Key1, MainTable.Key2)
);

Answer 4

Please try the following query:

Select MainID 
From MainTable m
Where exists 
( select 1 from SubTable s Where s.UserID=@UserID and s.SubID in (m.Key1,m.Key2))
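To see that the single correlated EXISTS form returns the same rows as the original pair of IN sub-queries, including a row whose Key1 is NULL, the following sketch runs both on a tiny invented dataset. It assumes Python's built-in sqlite3 as a stand-in engine (not SQL Server), with `:UserID` in place of `@UserID`; it checks results only, not plans.

```python
import sqlite3

# In-memory stand-in for the question's schema; all rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SubTable (SubID INTEGER PRIMARY KEY, UserID INTEGER);
    CREATE TABLE MainTable (MainID INTEGER PRIMARY KEY, Key1 INTEGER, Key2 INTEGER);
    INSERT INTO SubTable VALUES (1, 10), (2, 10), (3, 20);
    INSERT INTO MainTable VALUES
        (100, 1, 3),     -- only Key1 belongs to user 10
        (101, 3, 2),     -- only Key2 belongs to user 10
        (102, 3, 3),     -- neither key belongs to user 10
        (103, 1, 2),     -- both keys belong to user 10
        (104, NULL, 1);  -- NULL Key1, Key2 belongs to user 10
""")
user = {"UserID": 10}

in_or_sql = """
    SELECT MainID FROM MainTable
    WHERE Key1 IN (SELECT SubID FROM SubTable WHERE UserID = :UserID)
       OR Key2 IN (SELECT SubID FROM SubTable WHERE UserID = :UserID)
    ORDER BY MainID
"""
exists_sql = """
    SELECT MainID FROM MainTable AS m
    WHERE EXISTS (
        SELECT 1 FROM SubTable AS s
        WHERE s.UserID = :UserID AND s.SubID IN (m.Key1, m.Key2)
    )
    ORDER BY MainID
"""

in_or_ids = [row[0] for row in conn.execute(in_or_sql, user)]
exists_ids = [row[0] for row in conn.execute(exists_sql, user)]

print(in_or_ids)   # [100, 101, 103, 104]
print(exists_ids)  # [100, 101, 103, 104]
```

Row 103, where both keys match, still appears once in the EXISTS form because EXISTS only tests for the presence of a matching row.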

Answer 5

tldr; the original query and the JOIN proposal that follows, less the "looks redundant" part, should generate equivalent query plans. View the actual query plans if there are any doubts as to how SQL Server is [currently] treating a query. (See IN vs. JOIN vs. EXISTS for a taste of the magic.)

Is the SQL compiler intelligent enough to recognize this and run the sub-query once only or does it run twice?

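Yes, SQL Server is intelligent enough to handle this. It doesn't need to "run twice" (nit: the sub-query doesn't need to "run" at all, in a procedural sense). That is, there is no mandated explicit materialization stage - much less two of them. The JOIN transformation below shows why this isn't required.

Because these are independent (or non-correlated) sub-queries¹, as they have no dependency on the outer query, they can - and I dare say will - be optimized, since they can be freely and easily moved around under the rules of Relational Algebra (RA).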

As you can see, the sub-query is duplicated .. Is there a better way I can write this SQL?

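However, it still visually "looks redundant" because it is written that way. SQL Server doesn't care - but a human might. Thus the following is how I would write it and what I consider "better".

I am a big fan of using JOINs over sub-queries; once a JOIN approach is adopted it often "fits better" with RA. This simple transformation to a JOIN is possible due to the non-correlated nature of the original sub-queries - the [SQL Server] query planner is capable of doing such RA rewrites internally; view the actual query plans to see what differences, if any, there are.

The rewritten query would be: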

Select MainID
From MainTable 
Join (
    Select Distinct SubID -- SubId must be unique from select
    From SubTable
    Where UserID=@UserID
) t
-- Joining on "A or B" may indicate an ARC relationship
-- but this obtains the original results
On    Key1 = t.SubID
   Or Key2 = t.SubID

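The DISTINCT is added to the derived table query because of the unknown (to me) multiplicity of the SubId column - it can be treated as a redundant qualifier by SQL Server if SubId is bound by a Unique Constraint, so it's either required or "free". See IN vs. JOIN with large rowsets for why it matters whether the joined table's keys are unique.

Note: SQL Server will not necessarily rewrite an IN to a JOIN like the one above, as discussed in IN vs. JOIN vs. EXISTS; but the fundamental concepts of being able to move RA operations around (and of being able to view a query as the what and not the how) are still used.

¹ Some answers change the original sub-query to a dependent/correlated sub-query, which is going the wrong way. It may still result in a respectable (or even equivalent) query plan, as SQL Server will try to "undo" the change - but that is a step away from a clean RA model and JOINs! (And if SQL Server can't "undo" the added correlation, the query will be far worse.)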
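The effect of DISTINCT on the derived table can be sketched on a small dataset. This assumes Python's built-in sqlite3 as a stand-in engine (not SQL Server), invented sample rows, and a SubTable deliberately created without a key on SubID so that duplicates are possible (unlike the question's table, where SubID is the primary key). The sketch also surfaces one caveat worth checking against your own data: with the "A or B" join condition, a MainTable row whose Key1 and Key2 both match different SubIDs comes back once per match, so the join can return more rows than the IN form even with DISTINCT in the derived table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- No key on SubID here, so duplicate SubIDs are allowed (assumption for the demo).
    CREATE TABLE SubTable (SubID INTEGER, UserID INTEGER);
    CREATE TABLE MainTable (MainID INTEGER PRIMARY KEY, Key1 INTEGER, Key2 INTEGER);
    INSERT INTO SubTable VALUES (1, 10), (1, 10), (2, 10);
    INSERT INTO MainTable VALUES
        (100, 1, 9),   -- only Key1 matches
        (101, 1, 2);   -- Key1 and Key2 match different SubIDs
""")
user = {"UserID": 10}

def join_ids(derived_select):
    """Join MainTable to the given derived table on 'Key1 or Key2'."""
    sql = f"""
        SELECT MainID FROM MainTable
        JOIN ({derived_select}) AS t ON Key1 = t.SubID OR Key2 = t.SubID
        ORDER BY MainID
    """
    return [row[0] for row in conn.execute(sql, user)]

without_distinct = join_ids("SELECT SubID FROM SubTable WHERE UserID = :UserID")
with_distinct = join_ids("SELECT DISTINCT SubID FROM SubTable WHERE UserID = :UserID")

# The original IN/OR form, for comparison.
in_or_ids = [row[0] for row in conn.execute("""
    SELECT MainID FROM MainTable
    WHERE Key1 IN (SELECT SubID FROM SubTable WHERE UserID = :UserID)
       OR Key2 IN (SELECT SubID FROM SubTable WHERE UserID = :UserID)
    ORDER BY MainID
""", user)]

print(without_distinct)  # [100, 100, 101, 101, 101]
print(with_distinct)     # [100, 101, 101]
print(in_or_ids)         # [100, 101]
```

If both keys of a row can match, selecting DISTINCT MainID in the outer query is one way to get back to exactly the rows the IN form returns.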

sql

sql-server

subquery