Prevent usage of index for a particular query in Postgres

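I have a slow query in a Postgres database. Using EXPLAIN ANALYZE, I can see that Postgres runs a bitmap index scan on two different indexes, followed by a BitmapAnd on the two result sets.

Dropping one of the indexes makes the query ten times faster (a bitmap index scan is still used on the first index). However, the dropped index is useful for other queries.

Query: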

select
  booking_id
from
  booking
where
  substitute_confirmation_token is null
  and date_trunc('day', from_time) >= cast('01/25/2016 14:23:00.004' as date)
  and from_time >= '01/25/2016 14:23:00.004'
  and type = 'LESSON_SUBSTITUTE'
  and valid
order by
  booking_id;

Indexes:

"idx_booking_lesson_substitute_day" btree (date_trunc('day'::text, from_time)) WHERE valid AND type::text = 'LESSON_SUBSTITUTE'::text
"booking_substitute_confirmation_token_key" UNIQUE CONSTRAINT, btree (substitute_confirmation_token)

Query plan:

Sort  (cost=287.26..287.26 rows=1 width=8) (actual time=711.371..711.377 rows=44 loops=1)
  Sort Key: booking_id
  Sort Method: quicksort  Memory: 27kB
  Buffers: shared hit=8 read=7437 written=1
  ->  Bitmap Heap Scan on booking  (cost=275.25..287.25 rows=1 width=8) (actual time=711.255..711.294 rows=44 loops=1)
        Recheck Cond: ((date_trunc('day'::text, from_time) >= '2016-01-25'::date) AND valid AND ((type)::text = 'LESSON_SUBSTITUTE'::text) AND (substitute_confirmation_token IS NULL))
        Filter: (from_time >= '2016-01-25 14:23:00.004'::timestamp without time zone)
        Buffers: shared hit=5 read=7437 written=1
        ->  BitmapAnd  (cost=275.25..275.25 rows=3 width=0) (actual time=711.224..711.224 rows=0 loops=1)
              Buffers: shared hit=5 read=7433 written=1
              ->  Bitmap Index Scan on idx_booking_lesson_substitute_day  (cost=0.00..20.50 rows=594 width=0) (actual time=0.080..0.080 rows=72 loops=1)
                    Index Cond: (date_trunc('day'::text, from_time) >= '2016-01-25'::date)
                    Buffers: shared hit=5 read=1
              ->  Bitmap Index Scan on booking_substitute_confirmation_token_key  (cost=0.00..254.50 rows=13594 width=0) (actual time=711.102..711.102 rows=2718734 loops=1)
                    Index Cond: (substitute_confirmation_token IS NULL)
                    Buffers: shared read=7432 written=1
Total runtime: 711.436 ms

Can I prevent the use of a particular index for a particular query in Postgres?

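Your clever solution

You have already found a clever solution for your particular case: a partial unique index that covers only the rare values, so Postgres won't (can't) use the index for the common NULL value.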

CREATE UNIQUE INDEX booking_substitute_confirmation_uni
ON booking (substitute_confirmation_token)
WHERE substitute_confirmation_token IS NOT NULL;

This is a textbook use case for a partial index. Literally! The manual has a similar example and this perfectly matching advice:

Finally, a partial index can also be used to override the system's query plan choices. Also, data sets with peculiar distributions might cause the system to use an index when it really should not. In that case the index can be set up so that it is not available for the offending query. Normally, PostgreSQL makes reasonable choices about index usage (e.g., it avoids them when retrieving common values, so the earlier example really only saves index size, it is not required to avoid index usage), and grossly incorrect plan choices are cause for a bug report.

Keep in mind that setting up a partial index indicates that you know at least as much as the query planner knows, in particular you know when an index might be profitable. Forming this knowledge requires experience and understanding of how indexes in PostgreSQL work. In most cases, the advantage of a partial index over a regular index will be minimal.

You commented: "The table has few millions of rows and just few thousands of rows with not null values", so this is a perfect use case. It will even speed up queries on non-null values of substitute_confirmation_token, because the index is now much smaller.
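
Because booking_substitute_confirmation_token_key backs a UNIQUE constraint, the old full index would be removed by dropping the constraint rather than with DROP INDEX. A minimal sketch of that swap, assuming the partial index above has already been created and that no foreign key references the constraint:

```sql
-- Assumption: booking_substitute_confirmation_uni (the partial unique index)
-- already exists, so uniqueness of non-null tokens stays enforced.
ALTER TABLE booking
  DROP CONSTRAINT booking_substitute_confirmation_token_key;

-- Re-check the plan of the slow query from the question.
EXPLAIN (ANALYZE, BUFFERS)
SELECT booking_id
FROM   booking
WHERE  substitute_confirmation_token IS NULL
AND    date_trunc('day', from_time) >= cast('01/25/2016 14:23:00.004' AS date)
AND    from_time >= '01/25/2016 14:23:00.004'
AND    type = 'LESSON_SUBSTITUTE'
AND    valid
ORDER  BY booking_id;
```

The BitmapAnd node should no longer appear: the predicate substitute_confirmation_token IS NULL does not imply the index condition IS NOT NULL, so the planner cannot consider the partial index for this query.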

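Answer to the question

To answer your original question: it is not possible to "disable" an existing index for a particular query. You would have to drop it, but that is far too expensive.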

Fake index drop

You could drop an index inside a transaction, run your SELECT and then issue ROLLBACK instead of committing. That's fast, but be aware (per documentation):

A normal DROP INDEX acquires exclusive lock on the table, blocking other accesses until the index drop can be completed.

So this is no good for multi-user environments.

BEGIN;
DROP INDEX big_user_id_created_at_idx;
SELECT ...;
ROLLBACK;  -- so the index is preserved after all
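
The index name in the block above is a placeholder. For the index in the question, a plain DROP INDEX would be rejected because the index belongs to a UNIQUE constraint, so the constraint has to be dropped instead. A hedged sketch of the same trick for the booking table; the lock_timeout setting (available in Postgres 9.3 or later) and its value of 2 seconds are assumptions added here to limit how long the statement waits for the exclusive lock:

```sql
BEGIN;

-- Assumption: give up after 2 seconds instead of queueing behind other sessions.
SET LOCAL lock_timeout = '2s';

-- Takes an exclusive lock on booking, held until the ROLLBACK below.
ALTER TABLE booking
  DROP CONSTRAINT booking_substitute_confirmation_token_key;

-- Test the query without the index.
EXPLAIN (ANALYZE, BUFFERS)
SELECT booking_id
FROM   booking
WHERE  substitute_confirmation_token IS NULL
AND    date_trunc('day', from_time) >= cast('01/25/2016 14:23:00.004' AS date)
AND    from_time >= '01/25/2016 14:23:00.004'
AND    type = 'LESSON_SUBSTITUTE'
AND    valid
ORDER  BY booking_id;

ROLLBACK;  -- constraint and index are back as if nothing happened
```

Other sessions touching booking are blocked for as long as the transaction stays open, so this is still something for a test system or a quiet moment, not for regular use.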

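More detailed statistics

Normally, though, it is enough to raise the STATISTICS target for the column, so that Postgres can identify common values more reliably and avoid the index for those. Try: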

ALTER TABLE booking ALTER COLUMN substitute_confirmation_token SET STATISTICS 2000;

Then run ANALYZE booking; before you try your query again. 2000 is just an example value. Related:

  • Keep PostgreSQL from sometimes choosing a bad query plan
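
To see whether the planner's picture of the column matches reality, the collected statistics can be inspected after ANALYZE. In the plan from the question, the IS NULL condition was estimated at 13594 rows but matched 2718734, which suggests the NULL fraction was badly underestimated. A small sketch; add a filter on schemaname if more than one schema contains a booking table:

```sql
-- Refresh the statistics first.
ANALYZE booking;

-- null_frac is the estimated fraction of NULL entries in the column.
SELECT null_frac, n_distinct
FROM   pg_stats
WHERE  tablename = 'booking'
AND    attname   = 'substitute_confirmation_token';
```

With a few thousand non-null rows out of a few million, null_frac would be expected to be close to 1. If it is, the planner has what it needs to see that an index scan for IS NULL is not worthwhile.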