Postgres changing the query from index-only scan to bitmap scan when the data set increases
I have two identical queries that differ only in the values in the WHERE clause:
explain analyse select survey_contact_id, relation_id, count(survey_contact_id), count(relation_id) from nomination where survey_id = 1565 and account_id = 225 and deleted_at is NULL group by survey_contact_id, relation_id;
explain analyse select survey_contact_id, relation_id, count(survey_contact_id), count(relation_id) from nomination where survey_id = 888 and account_id = 12 and deleted_at is NULL group by survey_contact_id, relation_id;
When I run these two queries, they produce different plans.
Result of the first query:
GroupAggregate  (cost=0.28..8.32 rows=1 width=24) (actual time=0.016..0.021 rows=4 loops=1)
  Group Key: survey_contact_id, relation_id
  ->  Index Only Scan using test on nomination  (cost=0.28..8.30 rows=1 width=8) (actual time=0.010..0.012 rows=5 loops=1)
        Index Cond: ((account_id = 225) AND (survey_id = 1565))
        Heap Fetches: 5
Planning time: 0.148 ms
Execution time: 0.058 ms
Result of the second query:
GroupAggregate  (cost=11.08..11.12 rows=2 width=24) (actual time=0.015..0.015 rows=0 loops=1)
  Group Key: survey_contact_id, relation_id
  ->  Sort  (cost=11.08..11.08 rows=2 width=8) (actual time=0.013..0.013 rows=0 loops=1)
        Sort Key: survey_contact_id, relation_id
        Sort Method: quicksort  Memory: 25kB
        ->  Bitmap Heap Scan on nomination  (cost=4.30..11.07 rows=2 width=8) (actual time=0.008..0.008 rows=0 loops=1)
              Recheck Cond: ((account_id = 12) AND (survey_id = 888) AND (deleted_at IS NULL))
              ->  Bitmap Index Scan on test  (cost=0.00..4.30 rows=2 width=0) (actual time=0.006..0.006 rows=0 loops=1)
                    Index Cond: ((account_id = 12) AND (survey_id = 888))
Planning time: 0.149 ms
Execution time: 0.052 ms
Can anyone explain why Postgres performs a bitmap scan instead of an index-only scan?
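In short, Postgres uses a cost-based planner, so based on its statistics it estimated that a bitmap scan would be the cheaper option in the second case.
In your case, the total (estimated) costs of the two queries are 8.32 and 11.12 respectively. You can see the estimated cost of an index-only scan for the second query by running set enable_bitmapscan = off and then running the EXPLAIN again.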
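A minimal sketch of that check, assuming a psql session on the same database. Wrapping it in a transaction with SET LOCAL is just one way to keep the setting from leaking into the rest of the session. Note that the setting only discourages bitmap scans, so the planner may still pick a plain index scan or a sequential scan rather than an index-only scan.

```sql
BEGIN;

-- Discourage bitmap scans for this transaction only.
SET LOCAL enable_bitmapscan = off;

-- Re-run the second query and compare the top-level cost with 11.12.
EXPLAIN ANALYZE
SELECT survey_contact_id, relation_id,
       count(survey_contact_id), count(relation_id)
FROM nomination
WHERE survey_id = 888
  AND account_id = 12
  AND deleted_at IS NULL
GROUP BY survey_contact_id, relation_id;

-- Ending the transaction discards the SET LOCAL.
ROLLBACK;
```

If the cost reported here is higher than 11.12, that confirms the planner chose the bitmap scan simply because it estimated it to be cheaper.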
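Note that, based on its statistics, Postgres estimated that the first query would return 1 row (it actually returned 4) and that the second would return 2 rows (it actually returned 0).
There are several ways to get better statistics, but if analyze (or autovacuum) hasn't been run on that table for a while, that is a common cause of bad estimates. Another tell-tale that vacuum may not have been run recently (at least on this table) is the Heap Fetches: 5 that you can see in the first query plan: an index-only scan only has to visit the heap for pages that are not marked all-visible in the visibility map, which vacuum maintains.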
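One way to check this, sketched here under the assumption that the table lives in a schema visible to your session and that you have the privileges to vacuum it. pg_stat_user_tables and pg_class are standard catalog views; the rest is only an illustration.

```sql
-- When were statistics last gathered, and when was the table last vacuumed?
SELECT relname, last_analyze, last_autoanalyze,
       last_vacuum, last_autovacuum,
       n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'nomination';

-- How many of the table's pages are marked all-visible?
-- Index-only scans avoid heap fetches only for those pages.
SELECT relpages, relallvisible, reltuples
FROM pg_class
WHERE relname = 'nomination';

-- Refresh the planner statistics and the visibility map in one pass.
-- VACUUM cannot run inside a transaction block.
VACUUM (ANALYZE) nomination;
```

After this, re-running both EXPLAIN ANALYZE statements should show whether the row estimates and the Heap Fetches count have improved.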
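I'm confused by the "when data set increases" part of your question; if it is relevant, please add more context on that.
Finally, if you don't already have plans to upgrade PostgreSQL, I'd strongly recommend doing so soon. 9.6 is nearing end of support, and versions 10, 11, 12, and 13 each contain many performance-focused features.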