How to optimize the execution time for this query
I have the following query:
SELECT "factures"."id"
FROM "factures"
WHERE ( "factures"."id" NOT IN (SELECT DISTINCT( "echeances"."facture_id" )
FROM "echeances"
WHERE "echeances"."type_decheance" IN ( 2, 3, 4, 5, 8, 9 )
AND "echeances"."facture_id" IS NOT NULL
LIMIT 100000)) -- removing this LIMIT makes the query take an enormous amount of time
ORDER BY "factures"."id" DESC
Here is the EXPLAIN ANALYZE output with the limit of 100,000:
Index Only Scan Backward using factures_id_pkey on factures (cost=93516.76..211292.17 rows=530570 width=4) (actual time=1425.701..11466.759 rows=963698 loops=1)
Filter: (NOT (hashed SubPlan 1))
Rows Removed by Filter: 99997
Heap Fetches: 1063695
SubPlan 1
-> Limit (cost=0.43..93266.34 rows=100000 width=4) (actual time=0.022..1229.925 rows=100000 loops=1)
-> Unique (cost=0.43..264837.37 rows=283959 width=4) (actual time=0.022..1090.692 rows=100000 loops=1)
-> Index Scan using echeances__facture_id__idx on echeances (cost=0.43..262883.29 rows=781631 width=4) (actual time=0.020..819.735 rows=100167 loops=1)
Index Cond: (facture_id IS NOT NULL)
Filter: (type_decheance = ANY ('{2,3,4,5,8,9}'::integer[]))
Rows Removed by Filter: 156995
Planning time: 0.249 ms
Execution time: 11960.423 ms
Here is the EXPLAIN output without the limit:
Index Only Scan Backward using factures_id_pkey on factures (cost=0.86..142233669403.15 rows=530570 width=4)
Filter: (NOT (SubPlan 1))
SubPlan 1
-> Materialize (cost=0.43..267367.16 rows=283959 width=4)
-> Unique (cost=0.43..264837.37 rows=283959 width=4)
-> Index Scan using echeances__facture_id__idx on echeances (cost=0.43..262883.29 rows=781631 width=4)
Index Cond: (facture_id IS NOT NULL)
Filter: (type_decheance = ANY ('{2,3,4,5,8,9}'::integer[]))
Here is the schema:
Table "factures"
id
Table "echeances"
id
facture_id (fk)
type_decheance (integer)
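The problem is that the "factures" and "echeances" tables both have a large number of rows, and:
- when a LIMIT is specified in the subquery (for example LIMIT 100000), the query is fast;
- when no LIMIT is specified in the subquery, the query takes a very long time. I waited more than 15 minutes and then had to stop it.
The goal is to have this query run in a reasonable time without the limit.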
Switch to NOT EXISTS:
SELECT f.id
FROM factures f
WHERE NOT EXISTS (SELECT 1
FROM echeances e
WHERE e.facture_id = f.id AND
e.type_decheance IN ( 2, 3, 4, 5, 8, 9 )
)
ORDER BY f.id DESC;
Note that I removed all the double quotes. Don't quote your identifiers. It only makes queries harder to write and read.
Then you want an index on echeances(facture_id, type_decheance). This should be quite fast, because each facture id can be checked with a simple index lookup.
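As a rough sketch of the index suggested above, something like the following could be used. The index name is hypothetical (anything not already taken will do), and the ANALYZE and EXPLAIN steps are only an assumed way of checking the result, not part of the original suggestion.

```sql
-- Hypothetical index name; the column order follows the suggestion above.
CREATE INDEX echeances_facture_id_type_decheance_idx
    ON echeances (facture_id, type_decheance);

-- Refresh planner statistics for the table after building the index.
ANALYZE echeances;

-- Inspect the plan. Note that EXPLAIN ANALYZE actually executes the query.
EXPLAIN (ANALYZE, BUFFERS)
SELECT f.id
FROM factures f
WHERE NOT EXISTS (SELECT 1
                  FROM echeances e
                  WHERE e.facture_id = f.id
                    AND e.type_decheance IN (2, 3, 4, 5, 8, 9))
ORDER BY f.id DESC;
```

With NOT EXISTS, PostgreSQL can usually plan this as an anti join instead of a per-row SubPlan filter, but the actual plan depends on the server version, settings, and table statistics, so it is worth reading the EXPLAIN output rather than assuming.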