Is this Postgres query not optimal?

Question

My problem is that in Postgres 9.2 the following query takes a very long time to run:

select coalesce(sum(col_a), 0) 
from table_a 
where tid not in ( 
    select distinct tid 
    from table_b 
    where col_b = 13 )

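Note that tid is the primary key of table_a. In table_b, tid is indexed and references table_a as a foreign key.

The problem mostly occurs when the disk is nearly full and the table is being reindexed. I'm not a database expert, and I really don't understand what the problem could be.

Can someone help me understand the problem, or tell me whether there is a more optimal query?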

Answer 1

I would try NOT EXISTS:

select coalesce(sum(a.col_a), 0) 
from table_a a
where not exists (select 1 from table_b b where b.tid = a.tid and b.col_b = 13);
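
To see whether the rewrite actually changes the plan, the two forms can be compared with EXPLAIN. This is only a sketch: the plan you get depends on table sizes, statistics, and work_mem, and the node names mentioned in the comments are what to look for, not guaranteed output. Keep in mind that EXPLAIN with ANALYZE really executes the query, so the slow form will take as long as it usually does.

```sql
-- Plan for the original NOT IN form. Look for a filter such as
-- "NOT (hashed SubPlan 1)" or "NOT (SubPlan 1)" on the scan of table_a.
explain (analyze, buffers)
select coalesce(sum(col_a), 0)
from table_a
where tid not in (select distinct tid from table_b where col_b = 13);

-- Plan for the NOT EXISTS form. Look for an "Anti Join" node,
-- which means the planner turned the subquery into a join.
explain (analyze, buffers)
select coalesce(sum(a.col_a), 0)
from table_a a
where not exists (select 1 from table_b b where b.tid = a.tid and b.col_b = 13);
```

If the NOT IN plan shows a plain (non-hashed) SubPlan, the subquery result probably did not fit in work_mem, which is a plausible reason for the very long run time.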

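Aggregation can also help. A left join keeps the table_a rows that have no match in table_b, and the CASE expression counts the rows with col_b = 13 (this form also works on 9.2):

select coalesce(sum(t.col_a), 0)
from (select a.tid, a.col_a
      from table_a a left join
           table_b b
           on b.tid = a.tid
      group by a.tid, a.col_a
      -- keep only tids that have no table_b row with col_b = 13
      having sum(case when b.col_b = 13 then 1 else 0 end) = 0
     ) t;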

Another option is to use a left join:

select coalesce(sum(a.col_a), 0) 
from table_a a left join
     table_b b
     on b.tid = a.tid and b.col_b = 13
where b.tid is null;
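
Before swapping queries in production, it may be worth checking on a tiny data set that the NOT IN, NOT EXISTS, and left join forms agree. The script below is a sketch with made-up rows; the temporary tables reuse the names from the question and shadow the real tables only inside this transaction, which is rolled back at the end.

```sql
begin;

-- Hypothetical sample data, not taken from the question.
create temp table table_a (tid int primary key, col_a int) on commit drop;
create temp table table_b (tid int references table_a(tid), col_b int) on commit drop;

insert into table_a values (1, 10), (2, 20), (3, 30);
-- tid 1 has a row with col_b = 13, tid 2 does not, tid 3 has no rows at all.
insert into table_b values (1, 13), (1, 7), (2, 7);

-- Each of the three queries should return 50 (tid 2 plus tid 3).
select coalesce(sum(col_a), 0)
from table_a
where tid not in (select distinct tid from table_b where col_b = 13);

select coalesce(sum(a.col_a), 0)
from table_a a
where not exists (select 1 from table_b b where b.tid = a.tid and b.col_b = 13);

select coalesce(sum(a.col_a), 0)
from table_a a left join
     table_b b
     on b.tid = a.tid and b.col_b = 13
where b.tid is null;

rollback;
```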

For best performance, indexes on table_a(tid, col_a) and table_b(tid, col_b) would help.
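
As a rough sketch, the table_a index could be created as follows; the index name is hypothetical. Postgres 9.2 can use index-only scans, but only for pages marked all-visible, so a VACUUM after the build is assumed here. Note that building an index needs free disk space, which matters if the disk is already nearly full as described in the question.

```sql
-- Hypothetical index name; the columns are the ones suggested above.
create index idx_table_a_tid_col_a on table_a(tid, col_a);

-- Refresh planner statistics and the visibility map so that
-- index-only scans can be considered.
vacuum analyze table_a;
vacuum analyze table_b;
```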

Answer 2

I would recommend NOT EXISTS with the right index. So, write the query as:

select coalesce(sum(col_a), 0) 
from table_a a
where not exists (select 1
                  from table_b b
                  where b.tid = a.tid and b.col_b = 13
                 );

The index you want is on table_b(tid, col_b):

create index idx_table_b_tid_col_b on table_b(tid, col_b);

