Index is not used on a date column in Postgres: top-N heapsort is applied instead of an index scan
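I have a table 'posts' with millions of rows. I need the top N records ordered by a date field, but the query takes too long because of a top-N heapsort. How can I optimize this query?
Note: On my staging server, which has less data, it works fine. An index scan is used there.
DB version: Postgres 12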
CREATE TABLE public.posts
(
    post_id integer NOT NULL PRIMARY KEY, -- DEFAULT expression was not shown in the original post
    team_id integer NOT NULL,
    is_active boolean DEFAULT true,
    "createdAt" timestamp with time zone,
    "updatedAt" timestamp with time zone
);
The indexed columns are:
"createdAt" DESC NULLS LAST,
(team_id ASC NULLS LAST, "createdAt" DESC NULLS LAST),
team_id ASC NULLS LAST
Query:
SELECT p.*
FROM posts p
WHERE team_id = 1
AND p.is_active = true AND "createdAt" IS NOT NULL
ORDER BY "createdAt" DESC
LIMIT 20;
Query plan:
"Limit (cost=138958.67..138958.72 rows=20 width=360) (actual time=356.391..356.419 rows=20 loops=1)"
" -> Sort (cost=138958.67..139078.57 rows=47960 width=360) (actual time=356.389..356.402 rows=20 loops=1)"
" Sort Key: ""createdAt"" DESC"
" Sort Method: top-N heapsort Memory: 34kB"
" -> Index Scan using posts_team_id on posts p (cost=0.44..137682.47 rows=47960 width=360) (actual time=0.042..317.258 rows=52858 loops=1)"
" Index Cond: (team_id = 1)"
" Filter: (is_active AND (""createdAt"" IS NOT NULL))"
"Planning Time: 0.145 ms"
"Execution Time: 356.459 ms"
For this query:
SELECT p.*
FROM posts p
WHERE team_id = 1 AND
p.is_active = true AND
"createdAt" IS NOT NULL
ORDER BY "createdAt" DESC
LIMIT 20;
I would suggest dropping the IS NOT NULL condition and creating an index on (team_id, is_active, "createdAt" DESC).
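A minimal sketch of that suggestion, assuming the table from the question. The index name posts_team_active_created_idx is hypothetical, and whether the planner actually picks the index depends on your statistics. Note that without IS NOT NULL, any rows with a NULL "createdAt" sort first under a plain DESC, so this is only equivalent if the column contains no NULLs.

```sql
-- Hypothetical index name; the column order follows the suggestion above.
CREATE INDEX posts_team_active_created_idx
    ON posts (team_id, is_active, "createdAt" DESC);

-- The query without the IS NOT NULL condition.
-- Rows with a NULL "createdAt" (if any exist) would now come first.
SELECT p.*
FROM posts p
WHERE p.team_id = 1
  AND p.is_active = true
ORDER BY p."createdAt" DESC
LIMIT 20;
```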
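It will not use a DESC NULLS LAST index to support an ORDER BY ... DESC NULLS FIRST query. (NULLS FIRST is the default for DESC when nothing is specified, so that is what your query is doing.)
It seems odd that you went out of your way to specify the ordering in the index but did not make it match your query.
Either your staging server has different indexes, or it is running a different query.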
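One way to act on the mismatch described above, as a sketch rather than a guaranteed fix: keep the existing (team_id ASC NULLS LAST, "createdAt" DESC NULLS LAST) index and make the query's ORDER BY spell out the same null ordering. Because the query already filters out NULLs, the result set should be unchanged.

```sql
-- ORDER BY now matches the null ordering declared in the existing
-- (team_id, "createdAt" DESC NULLS LAST) index from the question.
SELECT p.*
FROM posts p
WHERE p.team_id = 1
  AND p.is_active = true
  AND p."createdAt" IS NOT NULL
ORDER BY p."createdAt" DESC NULLS LAST
LIMIT 20;
```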
This is the index I would use:
CREATE INDEX ON posts (team_id, "createdAt") WHERE is_active;
It supports both the WHERE condition and the ORDER BY.
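To check whether the partial index removes the sort step, something like the following could be run. This is a sketch: the index name posts_team_created_active_idx is hypothetical, and CONCURRENTLY is an optional choice made here to avoid blocking writes on a large table (it cannot be run inside a transaction block).

```sql
-- Hypothetical name; same definition as the index above, built without
-- blocking concurrent writes.
CREATE INDEX CONCURRENTLY posts_team_created_active_idx
    ON posts (team_id, "createdAt")
    WHERE is_active;

-- Re-run the original query and inspect the plan.
EXPLAIN (ANALYZE, BUFFERS)
SELECT p.*
FROM posts p
WHERE p.team_id = 1
  AND p.is_active = true
  AND p."createdAt" IS NOT NULL
ORDER BY p."createdAt" DESC
LIMIT 20;
```

If the index is used for ordering, the plan should no longer contain a separate Sort node with "top-N heapsort" above the scan.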