Is a non-correlated exists(...) query in the where clause executed for each row or just once?
I have the following query:
SELECT *
FROM t1, t2
WHERE t1.some_id = t2.some_id
and
not exists(select true from some_table where some_column = true)
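Here, not exists(select true from some_table where ...) does not refer to t1 or t2 from the main query.
Is this not exists(select true from some_table where ...) executed only once, or once for each tuple in the product of t1 and t2 for which t1.some_id = t2.some_id is true?
In other words, if it is executed only once and the result is false, an empty table could be returned immediately. Logically, what we wrote is: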
if (not exists(select true from some_table where some_column = true))
SELECT *
FROM t1, t2
WHERE t1.some_id = t2.some_id
else
empty table
First, learn to use proper JOIN syntax:
SELECT *
FROM t1 JOIN
t2
ON t1.some_id = t2.some_id
WHERE not exists(select true from some_table where some_column = true);
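The subquery should be executed only once. In the end, however, that is up to the Postgres optimizer. You can also rewrite the condition as a LEFT JOIN, which is intended to ensure that the check is evaluated only once: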
SELECT t1.*, t2.*
FROM t1 JOIN
t2
ON t1.some_id = t2.some_id LEFT JOIN
some_table st
ON st.some_column = true
WHERE st.some_column IS NULL;
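Since the outcome depends on the optimizer, it can be worth checking the plan on your own server. The following is a minimal sketch, assuming throwaway tables: the names t1, t2 and some_table mirror the question, but the column definitions (including payload) and the sample rows are invented for illustration. If the subquery shows up as an InitPlan feeding a One-Time Filter, it is run once for the whole statement.

```sql
-- Hypothetical stand-ins for the tables in the question (definitions are assumptions).
CREATE TEMP TABLE t1 (some_id integer, payload text);
CREATE TEMP TABLE t2 (some_id integer, payload text);
CREATE TEMP TABLE some_table (some_column boolean);

INSERT INTO t1 VALUES (1, 'a'), (2, 'b');
INSERT INTO t2 VALUES (1, 'x'), (2, 'y');
INSERT INTO some_table VALUES (false);

-- Form 1: the uncorrelated NOT EXISTS from the question.
EXPLAIN (ANALYZE)
SELECT *
FROM t1
JOIN t2 ON t1.some_id = t2.some_id
WHERE NOT EXISTS (SELECT 1 FROM some_table WHERE some_column = true);

-- Form 2: the LEFT JOIN rewrite, for comparison.
EXPLAIN (ANALYZE)
SELECT t1.*, t2.*
FROM t1
JOIN t2 ON t1.some_id = t2.some_id
LEFT JOIN some_table st ON st.some_column = true
WHERE st.some_column IS NULL;
```

Comparing the two plans side by side shows how your PostgreSQL version treats each form. The exact plan shapes and costs vary with version and table statistics.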
You can even wrap the EXISTS() condition in a plain SQL function, which will then be executed only once:
-- \i tmp.sql
CREATE TABLE omg
( id integer NOT NULL PRIMARY KEY
, must_pay integer NOT NULL
);
INSERT INTO omg(id, must_pay) VALUES(1,0);
CREATE FUNCTION owe_money() RETURNS BOOLEAN AS
$func$
SELECT EXISTS(SELECT 1
FROM omg o
WHERE o.must_pay > 0
);
$func$
-- language sql;
language sql STABLE;
EXPLAIN
SELECT owe_money();
INSERT INTO omg(id, must_pay) VALUES(2,100);
EXPLAIN
SELECT owe_money();
EXPLAIN ANALYZE
SELECT * FROM omg
WHERE owe_money();
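If you add the keyword STABLE to the function definition, the DBMS knows that the return value will not change within a single statement. Without STABLE, the function is VOLATILE by default and is called once for each row: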
QUERY PLAN
-------------------------------------------------------------------------------------------------
Seq Scan on omg (cost=0.00..566.40 rows=713 width=8) (actual time=0.303..0.331 rows=2 loops=1)
Filter: owe_money()
Total runtime: 0.384 ms
(3 rows)
With the STABLE function, the result is a one-time filter:
QUERY PLAN
-------------------------------------------------------------------------------------------------------
Result (cost=0.25..31.65 rows=2140 width=8) (actual time=0.304..0.316 rows=2 loops=1)
One-Time Filter: owe_money()
-> Seq Scan on omg (cost=0.25..31.65 rows=2140 width=8) (actual time=0.004..0.009 rows=2 loops=1)
Total runtime: 0.379 ms
(4 rows)
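To confirm which volatility label a function currently carries, the system catalog can be queried. This is a small sketch that assumes the owe_money() function defined above already exists; pg_proc.provolatile and ALTER FUNCTION are standard PostgreSQL features.

```sql
-- provolatile: 'v' = VOLATILE (the default), 's' = STABLE, 'i' = IMMUTABLE
SELECT proname, provolatile
FROM pg_proc
WHERE proname = 'owe_money';

-- Change the label in place instead of dropping and recreating the function.
ALTER FUNCTION owe_money() STABLE;
```

Re-running the EXPLAIN after the ALTER FUNCTION is a quick way to see whether the plan switches from a per-row Filter to a One-Time Filter.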
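And the same condition written as an uncorrelated subquery yields essentially the same plan as the STABLE function (now the subquery is executed once, and its result ($0) is tested in the one-time filter):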
EXPLAIN ANALYZE
SELECT * FROM omg
WHERE EXISTS (
SELECT 1
FROM omg o
WHERE o.must_pay > 0
);
QUERY PLAN
----------------------------------------------------------------------------------------------------------
Result (cost=0.05..31.45 rows=2140 width=8) (actual time=0.022..0.034 rows=2 loops=1)
One-Time Filter: $0
InitPlan 1 (returns $0)
-> Seq Scan on omg o (cost=0.00..36.75 rows=713 width=0) (actual time=0.011..0.011 rows=1 loops=1)
Filter: (must_pay > 0)
Rows Removed by Filter: 1
-> Seq Scan on omg (cost=0.00..31.40 rows=2140 width=8) (actual time=0.003..0.008 rows=2 loops=1)
Total runtime: 0.081 ms
(8 rows)
But aren't we still scanning the table and effectively evaluating where false for every row if the function returned false, or where true if it returned true?
EXPLAIN ANALYZE
SELECT * FROM omg
WHERE EXISTS (
SELECT 1
FROM omg o
WHERE o.id < 0
);
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Result (cost=1.66..3.66 rows=100 width=8) (actual time=0.011..0.011 rows=0 loops=1)
One-Time Filter: $0
InitPlan 1 (returns $0)
-> Index Only Scan using omg_pkey on omg o (cost=0.14..1.66 rows=1 width=0) (actual time=0.006..0.006 rows=0 loops=1)
Index Cond: (id < 0)
Heap Fetches: 0
-> Seq Scan on omg (cost=0.00..2.00 rows=100 width=8) (never executed)
Total runtime: 0.063 ms
(8 rows)
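The "(never executed)" on the outer Seq Scan says it all: when the one-time filter is false, the table is not scanned at all.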
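Another way to observe how often the condition is evaluated is to make the function announce each call. The sketch below assumes the omg table from above; owe_money_noisy() is a hypothetical helper introduced only for this experiment, written in PL/pgSQL so that it can use RAISE NOTICE.

```sql
-- Hypothetical helper: same check as owe_money(), but it reports every call.
CREATE FUNCTION owe_money_noisy() RETURNS boolean AS
$func$
BEGIN
    RAISE NOTICE 'owe_money_noisy() called';
    RETURN EXISTS (SELECT 1 FROM omg o WHERE o.must_pay > 0);
END;
$func$
LANGUAGE plpgsql VOLATILE;

-- Declared VOLATILE: the call is an ordinary per-row filter.
SELECT * FROM omg WHERE owe_money_noisy();

ALTER FUNCTION owe_money_noisy() STABLE;

-- Declared STABLE: the planner is free to treat the call as a one-time filter.
SELECT * FROM omg WHERE owe_money_noisy();
```

If the plans shown above carry over to your server, the first SELECT should print one notice per row of omg and the second far fewer. The exact count depends on the planner, so treat it as something to observe rather than a guarantee.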