Postgres using primary_key index in almost every query

Question

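We are upgrading our Postgres database from version 9.3.14 to 9.4.9 and are currently in the testing stage. During testing we hit an issue that causes high CPU usage once the database is on 9.4.9: for some queries, Postgres 9.4 uses the primary key index even though cheaper options are available. For example, running EXPLAIN ANALYZE on the following query: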

SELECT  a.id as a_id, b.col_id as col_id
FROM a
INNER JOIN b ON b.id = a.b_id
WHERE (a.col_text = 'pqrs' AND a.col_int = 1)
ORDER BY a.id ASC LIMIT 1

gives this:

Limit  (cost=0.87..4181.94 rows=1 width=8) (actual time=93014.991..93014.992 rows=1 loops=1)
 ->  Nested Loop  (cost=0.87..1551177.78 rows=371 width=8) (actual time=93014.990..93014.990 rows=1 loops=1)
       ->  Index Scan using a_pkey on a  (cost=0.43..1548042.20 rows=371 width=8) (actual time=93014.968..93014.968 rows=1 loops=1)
             Filter: ((col_int = 1) AND ((col_text)::text = 'pqrs'::text))
             Rows Removed by Filter: 16114217
       ->  Index Scan using b_pkey on b  (cost=0.43..8.44 rows=1 width=8) (actual time=0.014..0.014 rows=1 loops=1)
             Index Cond: (id = a.b_id)
Planning time: 0.291 ms
Execution time: 93015.041 ms

whereas the query plan for the same query in 9.3.14 gives this:

Limit  (cost=17.06..17.06 rows=1 width=8) (actual time=5.066..5.067 rows=1 loops=1)
 ->  Sort  (cost=17.06..17.06 rows=1 width=8) (actual time=5.065..5.065 rows=1 loops=1)
       Sort Key: a.id
       Sort Method: quicksort  Memory: 25kB
       ->  Nested Loop  (cost=1.00..17.05 rows=1 width=8) (actual time=5.047..5.049 rows=1 loops=1)
             ->  Index Scan using index_a_on_col_text on a  (cost=0.56..8.58 rows=1 width=8) (actual time=3.154..3.155 rows=1 loops=1)
                   Index Cond: ((col_text)::text = 'pqrs'::text)
                   Filter: (col_int = 1)
             ->  Index Scan using b_pkey on b  (cost=0.43..8.46 rows=1 width=8) (actual time=1.888..1.889 rows=1 loops=1)
                   Index Cond: (id = a.b_id)
Total runtime: 5.112 ms

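If I remove the ORDER BY clause from the query, it uses the appropriate index and works fine. I can understand that in this case (with ORDER BY) the planner is trying to walk the primary key index in order, filtering rows until it finds a valid one. But it is evident that using the col_text index and sorting explicitly is far cheaper.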
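One way to check the claim that an explicit sort is cheaper, without touching any planner settings, is to order by an expression instead of the bare column. This is only a diagnostic sketch: it assumes a.id is a numeric type, and the `+ 0` is a workaround added here for illustration, not something taken from the original query.

```sql
-- Diagnostic sketch: "a.id + 0" is an expression, so the ordering of a_pkey
-- cannot satisfy the ORDER BY and the planner has to sort explicitly.
-- Assumes a.id is an integer (or other numeric) column.
EXPLAIN ANALYZE
SELECT a.id AS a_id, b.col_id AS col_id
FROM a
INNER JOIN b ON b.id = a.b_id
WHERE a.col_text = 'pqrs' AND a.col_int = 1
ORDER BY a.id + 0 ASC
LIMIT 1;
```

If the resulting plan looks like the 9.3.14 one (an index scan on index_a_on_col_text followed by a Sort), that suggests the problem lies in the planner's row estimates rather than in a missing index.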

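I have looked into Postgres parameters such as enable_indexscan and enable_seqscan, which are on by default. We want to leave it to the database to decide between an index scan and a sequential scan. We have also tried tweaking effective_cache_size, random_page_cost and seq_page_cost. enable_sort is also on.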
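For reference, the settings named above can be inspected in one query against the standard pg_settings view. This is a minimal sketch; the `source` column shows whether a value is still the default or was overridden somewhere (for example in a parameter group or in the session).

```sql
-- List the current value and origin of the planner settings mentioned above.
SELECT name, setting, unit, source
FROM pg_settings
WHERE name IN ('enable_indexscan', 'enable_seqscan', 'enable_sort',
               'effective_cache_size', 'random_page_cost', 'seq_page_cost')
ORDER BY name;
```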

This is not happening only for this particular query; some other queries also use the primary key index instead of other possible, more efficient methods.

P.S.:

Answer 1

After filing a case with AWS Support, this is what I got:

I understand that you want to know why you have degraded performance on your recently upgraded instance. This is the expected and general behavior of upgrade on a Postgres instance. Once upgrade is completed, you need to run ANALYZE on each user database to update statistics of the tables. This also makes SQLs performing better. A better way to do that is using vacuumdb[1], like this:

vacuumdb -U [your user] -d [your database] -Ze -h [your rds endpoint]

It will optimize your database execution plan only, not freeing space, but will take less time than a complete vacuum.

This resolved the issue. Hope this helps others who stumble upon this kind of problem.
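The same statistics refresh can also be done from a SQL session, and the result can be checked afterwards. The following is a sketch under the assumption that the upgrade left the planner statistics empty, which is the usual situation after an in-place major version upgrade; the table and column names are the ones from the question.

```sql
-- Recompute planner statistics for every table in the current database.
-- Roughly what "vacuumdb -Z" does for one database.
ANALYZE VERBOSE;

-- Confirm that the tables have been analyzed (manually or by autovacuum).
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY relname;

-- Check that column statistics now exist for the filtered columns.
-- No rows here would mean the planner is still estimating blind.
SELECT attname, null_frac, n_distinct
FROM pg_stats
WHERE tablename = 'a'
  AND attname IN ('col_text', 'col_int');
```

Re-running EXPLAIN ANALYZE on the original query after this should show whether the planner has gone back to using index_a_on_col_text.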

database

postgresql

indexing

query-performance

query-planner