Why is PostgreSQL not using the index properly?

Schema:

create table records(
  id         varchar,
  updated_at bigint
);
create index index1 on records (updated_at, id);

The query iterates over the most recently updated records: fetch 10 records, remember the last one, then fetch the next 10, and so on.

select * from records
where updated_at > '1' or (updated_at = '1' and id > 'some-id')
order by updated_at, id
limit 10;

It uses the index, but not wisely: it also applies a filter and processes a large number of records. See Rows Removed by Filter: 31575 in the query plan below.

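Strangely, if you remove the OR and keep only the left or only the right condition, the index is applied well in either case. But with both conditions joined by OR, the planner seems unable to figure out how to apply the index properly.
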
Limit  (cost=0.42..19.03 rows=20 width=1336) (actual time=542.475..542.501 rows=20 loops=1)
   ->  Index Scan using index1 on records  (cost=0.42..426791.29 rows=458760 width=1336) (actual time=542.473..542.494 rows=20 loops=1)
         Filter: ((updated_at > '1'::bigint) OR ((updated_at = '1'::bigint) AND ((id)::text > 'some-id'::text)))
         Rows Removed by Filter: 31575
 Planning time: 0.180 ms
 Execution time: 542.532 ms
(6 rows)

The Postgres version is 9.6.

I would try it as two separate queries and combine their results, like this:

select *
from
  (
    -- each branch needs its own parentheses so that its
    -- order by / limit applies to that branch only
    (
      select   *
      from     records
      where    updated_at > 1
      order by updated_at, id
      limit    10
    )
    union all
    (
      select   *
      from     records
      where    updated_at = 1
        and    id > 'some-id'
      order by updated_at, id
      limit    10
    )
  ) t
order by updated_at, id
limit    10;

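My guess is that each of the two queries can be optimized well on its own, and that running both will be more efficient than the current query.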

I would also make these columns NOT NULL if possible.
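
Another option that may be worth trying is a row-value comparison, which is not part of the original question or the query above. For non-null values, (updated_at, id) > (1, 'some-id') means the same as the OR condition, and PostgreSQL can match a row comparison against a multicolumn B-tree index whose column order is the same. This is a minimal sketch using the schema from the question. Whether the 9.6 planner picks the index condition for your data should be confirmed with EXPLAIN ANALYZE.

```sql
-- Keyset pagination with a row-value comparison (a sketch, not verified on
-- the poster's data). 1 and 'some-id' stand for the updated_at and id of the
-- last row of the previous page.
explain analyze
select *
from   records
where  (updated_at, id) > (1, 'some-id')
order  by updated_at, id
limit  10;
```

If this works as intended, the plan should show the comparison under Index Cond instead of Filter, with no large Rows Removed by Filter count. The equivalence with the OR form holds only when neither column is null, which is one more reason to declare both columns NOT NULL.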

This comes down to how PostgreSQL uses multicolumn indexes. From the documentation:

For example, given an index on (a, b, c) and a query condition WHERE a = 5 AND b >= 42 AND c < 77, the index would have to be scanned from the first entry with a = 5 and b = 42 up through the last entry with a = 5. Index entries with c >= 77 would be skipped, but they'd still have to be scanned through. This index could in principle be used for queries that have constraints on b and/or c with no constraint on a — but the entire index would have to be scanned, so in most cases the planner would prefer a sequential table scan over using the index.

https://www.postgresql.org/docs/9.6/static/indexes-multicolumn.html
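
The example from the documentation can be reproduced with a throwaway table. The table name t and its columns are hypothetical and only mirror the (a, b, c) example in the quote. This is a sketch for inspecting the plan yourself, not a statement of what the plan will look like.

```sql
-- Hypothetical table mirroring the documentation's (a, b, c) example.
create table t (
  a integer not null,
  b integer not null,
  c integer not null
);
create index t_a_b_c_idx on t (a, b, c);

-- Per the quoted documentation, the scan starts at the first entry with
-- a = 5 and b = 42 and runs through the last entry with a = 5; entries
-- with c >= 77 are skipped but still scanned through.
explain
select *
from   t
where  a = 5 and b >= 42 and c < 77;
```

In the plan from the question, the whole OR expression appears under Filter rather than Index Cond, so it does not limit which part of the index is scanned. That matches the 31575 rows read and then discarded.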