Where vs ON in outer join

Question

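I am wondering how to get better SQL performance when deciding whether to duplicate a condition in the ON clause when it already appears in the Where clause.

My friend told me that it depends on the DB engine, but I am not so sure.

Regardless of DB engines, normally, the condition in the Where clause should be executed first, before the join, but I assume that applies to inner joins and not outer joins, because some conditions can only be executed AFTER the outer join.

For example: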

Select a.*, b.*
From A a
Left outer join B b on a.id = b.id
Where b.id is NULL;

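The condition in the Where clause cannot be executed before the outer join.

So, I assume the whole ON clause must be executed before the Where clause, and it seems the ON clause controls the size of table B (or table A if we use a right outer join) before the outer join takes place. To me, that seems unrelated to the DB engine.

This raises my question: when we use an outer join, should we always duplicate our conditions in the ON clause?

For example (I outer join a table with a shorter version of itself):

temp_series_installment restricted to series_id > 18940000, compared against temp_series_installment: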

select sql_no_cache s.*, t.* from temp_series_installment s
left outer join temp_series_installment t on s.series_id = t.series_id and t.series_id > 18940000 and t.incomplete = 1
where t.incomplete = 1;

VS

select sql_no_cache s.*, t.* from temp_series_installment s
left outer join temp_series_installment t on s.series_id = t.series_id and t.series_id > 18940000
where t.incomplete = 1;

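Edit: where t.incomplete = 1 carries the same logic as where t.series_id is not null, so this is effectively an inner join, as Gordon Linoff suggested. But what I have been asking is: if it outer joins a smaller table, shouldn't it be faster?

I tried to see whether there is any performance difference in MySQL:

To my surprise, the second one is faster. Why? I thought that outer joining a smaller table would make the query faster.

My idea comes from: https://www.ibm.com/support/knowledgecenter/en/SSZLC2_8.0.0/com.ibm.commerce.developer.doc/refs/rsdperformanceworkspaces.htm

Section:

Push predicates into the OUTER JOIN clause whenever possible
Duplicate constant conditions for different tables whenever possible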

Answer 1

Regardless of DB engines, normally, the condition in Where clause should be executed first before join, but I assume it means inner join but not outer join. Because some conditions can only be executed AFTER outer join.

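This is simply not true. SQL is a descriptive language. It does not specify how the query is executed. It only specifies what the result set looks like. The SQL compiler/optimizer determines the actual processing steps needed to meet the requirements described by the query.

In terms of semantics, the FROM clause is the first clause that is "evaluated". Hence, the FROM is logically processed before the WHERE clause.

The rest of your question is similarly misguided. Comparison logic in the where clause, such as: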

from s left join
     t
    on s.series_id = t.series_id and t.series_id > 18940000
where t.incomplete = 1

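turns the outer join into an inner join: for every unmatched row of s, t.incomplete is NULL, so the where condition rejects it. So, the logic is different from what you think is going on.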
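To make that concrete, here is a minimal sketch using Python's built-in sqlite3 module. The four sample rows and the use of SQLite instead of MySQL are assumptions made purely for illustration; only the table and column names come from the question. It compares the left join plus where filter against an explicit inner join, and against a left join that carries the filter only in the ON clause.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE temp_series_installment (
        series_id  INTEGER PRIMARY KEY,
        incomplete INTEGER NOT NULL
    );
    INSERT INTO temp_series_installment VALUES
        (18939999, 1), (18940001, 1), (18940002, 0), (18940003, 1);
""")

# LEFT JOIN, but the WHERE clause filters on a column of the outer-joined table.
left_with_where = conn.execute("""
    SELECT s.series_id, t.series_id
    FROM temp_series_installment s
    LEFT JOIN temp_series_installment t
      ON s.series_id = t.series_id AND t.series_id > 18940000
    WHERE t.incomplete = 1
    ORDER BY s.series_id
""").fetchall()

# The same query written as an explicit INNER JOIN.
inner_join = conn.execute("""
    SELECT s.series_id, t.series_id
    FROM temp_series_installment s
    INNER JOIN temp_series_installment t
      ON s.series_id = t.series_id AND t.series_id > 18940000
    WHERE t.incomplete = 1
    ORDER BY s.series_id
""").fetchall()

# Filter only in ON: every row of s survives, unmatched rows get NULLs.
on_only = conn.execute("""
    SELECT s.series_id, t.series_id
    FROM temp_series_installment s
    LEFT JOIN temp_series_installment t
      ON s.series_id = t.series_id
     AND t.series_id > 18940000
     AND t.incomplete = 1
    ORDER BY s.series_id
""").fetchall()

print(left_with_where == inner_join)  # same rows either way
print(len(on_only))                   # one row per row of s
```

With this sample data the first two queries return the same two rows, while the ON-only variant keeps all four rows of s and fills t with NULL where nothing matched.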

Answer 2

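As Gordon Linoff pointed out, your friend is plain wrong.

I would add that developers like to think of SQL as if it were one of their trade languages (C++, VB, Java), but those are procedural/imperative languages. When you write SQL, you are in a different paradigm. You are only describing a function to be applied to a dataset.

Let's take an example: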

Select a.*, b.*
From A a
Left outer join B b on a.id = b.id
Where b.id is NULL;

If a.Id and b.Id are non-nullable columns, the query above

is semantically equivalent to

Select a.*, null, ..., null
From A a
where not exists (select * from B b where b.Id = a.Id)

Now try to run both queries and profile them. In most DBMSs I would expect both queries to be executed in exactly the same way.
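As a rough way to try this, the sketch below runs both forms against SQLite through Python's sqlite3 module and prints each plan with EXPLAIN QUERY PLAN. The sample ids are made up, and SQLite is used only because it ships with Python. Its plans are not MySQL's, and the two plans are not guaranteed to match on every engine, so treat the plan output as something to inspect rather than as a result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER NOT NULL PRIMARY KEY);
    CREATE TABLE B (id INTEGER NOT NULL PRIMARY KEY);
    INSERT INTO A VALUES (1), (2), (3), (4);   -- hypothetical sample ids
    INSERT INTO B VALUES (2), (4), (5);
""")

left_join_sql = """
    SELECT a.id
    FROM A a
    LEFT OUTER JOIN B b ON a.id = b.id
    WHERE b.id IS NULL
    ORDER BY a.id
"""

not_exists_sql = """
    SELECT a.id
    FROM A a
    WHERE NOT EXISTS (SELECT * FROM B b WHERE b.id = a.id)
    ORDER BY a.id
"""

rows_left_join = conn.execute(left_join_sql).fetchall()
rows_not_exists = conn.execute(not_exists_sql).fetchall()
print(rows_left_join == rows_not_exists)  # both describe the same result set

# Show how this particular engine chose to execute each description.
for label, sql in (("left join", left_join_sql), ("not exists", not_exists_sql)):
    print(label)
    for step in conn.execute("EXPLAIN QUERY PLAN " + sql):
        print("   ", step)
```

On MySQL the equivalent check would be to prefix each query with EXPLAIN and compare the output.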

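That is because the engine decides how to implement your "function" over the dataset.

Note that, in set mathematics, the example above is equivalent to: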

Give me the set A minus the intersection between A and B.
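The set reading can be checked directly. This is a small sketch with made-up ids: it computes A minus (A intersect B) with Python sets, notes that it equals the plain difference A minus B, and compares both against SQL's EXCEPT operator in SQLite (again a stand-in engine, not the one from the question).

```python
import sqlite3

# Hypothetical id sets standing in for tables A and B.
a_ids = {1, 2, 3, 4}
b_ids = {2, 4, 5}

# "A minus the intersection between A and B" ...
minus_intersection = a_ids - (a_ids & b_ids)
# ... is the same set as the plain difference A minus B.
plain_difference = a_ids - b_ids

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (id INTEGER NOT NULL PRIMARY KEY)")
conn.execute("CREATE TABLE B (id INTEGER NOT NULL PRIMARY KEY)")
conn.executemany("INSERT INTO A VALUES (?)", [(i,) for i in sorted(a_ids)])
conn.executemany("INSERT INTO B VALUES (?)", [(i,) for i in sorted(b_ids)])

# EXCEPT expresses the same set difference on the id column.
except_ids = {row[0] for row in conn.execute("SELECT id FROM A EXCEPT SELECT id FROM B")}

print(minus_intersection == plain_difference == except_ids)
```

EXCEPT compares whole rows, so it matches the left join form only when the select list is limited to the id column, as it is here.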

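Engines can decide how to execute your query because they have a few tricks up their sleeve. An engine has metrics about your tables, indexes, and so on, and can use them to, for example, "make a join" in a different order than the one you wrote.

IMHO, today's engines are really good at finding the best way to implement the function you describe, and they rarely need query hints. Of course, you can end up describing your function in an overly complicated way, and that affects how the engine decides to run it. The art of describing functions and sets well and of managing indexes is what we call query tuning.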
