PostgreSQL: re-write query in another way

Question

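I need help rewriting a query to improve its performance. I think the query below is slow because of the OR section and the two subqueries against the products table (so two scans) for each key in the category table.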

SELECT key
FROM category c
WHERE (1=1)
AND (  ((EXISTS (SELECT * from products p
                  WHERE p.attribute_key=2 
                   AND p.category_key=c.key 
                   AND ((value && CAST(ARRAY['Active', 'active'] AS text[])))
                   AND p.status='active'
        ))
     OR(NOT EXISTS(SELECT * from products p
                    WHERE p.attribute_key=2 
                     AND p.category_key=c.key 
                     AND p.status='active') 
        ))
    )
AND c.status='active'
AND c.type_key=4

The expected output has 9 rows.

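The query returns keys from the category table when type_key=4 and category.status='active' and either of the following holds:

1. The products table has a row with value 'Active' or 'active', attribute_key=2 and status='active'. (This is the EXISTS part of the query.)
2. The products table has no attribute_key rows at all, or only attribute_key!=2 rows, for the type_key=4 category. (This is the OR (NOT EXISTS) part of the query.) Example: type_key 13 and 3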

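The EXPLAIN ANALYZE plan for the query is below. The indexes are being used well.

I am hoping the query can be improved by writing it another way or by changing the OR part.

Sample data is here: dbfiddle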

Aggregate  (cost=3156220.03..3156220.04 rows=1 width=8) (actual time=86100.329..86100.342 rows=1 loops=1)
  ->  Index Scan using category_type_key_status on category c  (cost=0.43..3155906.64 rows=125355 width=0) (actual time=12.618..85925.747 rows=120852 loops=1)
        Index Cond: ((type_key = 4) AND (status = 'active'))
        Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) OR (NOT (SubPlan 3)))
        Rows Removed by Filter: 86879
        SubPlan 1
          ->  Index Scan using products_category_key_attribute_key_status on products p  (cost=0.56..8.59 rows=1 width=0) (actual time=0.332..0.332 rows=1 loops=207731)
                Index Cond: ((category_key = c.key) AND (attribute_key = 2) AND (status = 'active'))
                Filter: (value && '{Active,active}'::text[])
                Rows Removed by Filter: 0
        SubPlan 2
          ->  Gather  (cost=1000.00..1155110.30 rows=8916 width=4) (never executed)
                Workers Planned: 2
                Workers Launched: 0
                ->  Parallel Seq Scan on products p_1  (cost=0.00..1153218.70 rows=3715 width=4) (never executed)
                      Filter: ((value && '{Active,active}'::text[]) AND (attribute_key = 2) AND (status = 'active'))
        SubPlan 3
          ->  Index Only Scan using  products_category_key_attribute_key_status on products p_2  (cost=0.56..8.58 rows=1 width=0) (actual time=0.008..0.008 rows=1 loops=86933)
                Index Cond: ((category_key = c.key) AND (attribute_key = 2) AND (status = 'active'))
                Heap Fetches: 11497
Planning Time: 35.808 ms

Answer 1

The two predicates on the products table are almost identical. You can simply join this table once and use an OR to apply either condition:

SELECT c.key
  FROM category c
  LEFT OUTER
  JOIN products p
    ON p.category_key = c.key
   AND p.status = 'active'
   AND p.attribute_key = 2
 WHERE c.status='active'
   AND c.type_key=4
   AND (   p.category_key IS NULL -- NOT EXISTS
        OR ((value && CAST(ARRAY['Active', 'active'] AS text[])))) -- or value matches

dbfiddle with results
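
One thing worth checking with the join version: if a category can have more than one active attribute_key=2 row in products, the join returns that category's key once per matching row, whereas the original EXISTS form returns it once. A possible variant, sketched here under the assumption that the table and column names are exactly those in the question, aggregates products to one row per category_key before joining. The alias s and the column name any_match are made up for illustration.

```sql
-- Sketch only: collapse products to one row per category first,
-- then join once. any_match is true if at least one active
-- attribute_key = 2 row has a value overlapping {Active, active}.
SELECT c.key
  FROM category c
  LEFT OUTER
  JOIN (SELECT p.category_key,
               -- COALESCE treats a NULL value array as "no match"
               bool_or(COALESCE(p.value && CAST(ARRAY['Active', 'active'] AS text[]), false)) AS any_match
          FROM products p
         WHERE p.status = 'active'
           AND p.attribute_key = 2
         GROUP BY p.category_key) s
    ON s.category_key = c.key
 WHERE c.status = 'active'
   AND c.type_key = 4
   AND (   s.category_key IS NULL -- no such products rows: the NOT EXISTS case
        OR s.any_match);          -- at least one matching value: the EXISTS case
```

This returns each category key at most once, provided category.key is unique. Whether it is faster than the per-row index probes in the plan above depends on how many products rows have attribute_key=2 and status='active', so it is worth comparing both forms with EXPLAIN (ANALYZE, BUFFERS) on the real data.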
