Postgres select on large (15m rows) table extremely slow, even with index
I am trying to run EXPLAIN ANALYZE, but it never finishes because the query is so slow. If it ever does, I will post the results, but for now, here is the plain EXPLAIN.
Query:
EXPLAIN SELECT
    *
FROM
    "Posts" AS "Post"
WHERE
    (
        "Post"."featurePostOnDate" > '2020-06-25 19:28:07.816 +00:00'
        OR (
            "Post"."featurePostOnDate" IS NULL
            AND "Post"."userId" IN (6863684)
        )
    )
    AND "Post"."private" IS NULL
ORDER BY
    "Post"."featurePostOnDate" DESC NULLS LAST,
    "Post"."createdAt" DESC NULLS LAST
LIMIT 10;
Result:
Limit (cost=0.56..110.92 rows=10 width=1136)
  -> Index Scan using posts_updated_following_feed_idx on "Posts" "Post" (cost=0.56..284949.60 rows=25819 width=1136)
       Filter: (("featurePostOnDate" > '2020-06-25 19:28:07.816+00'::timestamp with time zone) OR (("featurePostOnDate" IS NULL) AND ("userId" = 6863684)))
Index:
CREATE INDEX "posts_updated_following_feed_idx" ON "public"."Posts" USING btree (
    "featurePostOnDate" DESC NULLS LAST,
    "createdAt" DESC NULLS LAST
)
WHERE
    private IS NULL;
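You have 15 million rows and you ran EXPLAIN with ANALYZE. The ANALYZE option actually executes the query, so it takes as long as the query itself; see https://www.postgresql.org/docs/9.1/sql-explain.html.
Also, the WHERE clause filters on a column ("userId") that is not part of the index: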
WHERE
    (
        "Post"."featurePostOnDate" > '2020-06-25 19:28:07.816 +00:00'
        OR (
            "Post"."featurePostOnDate" IS NULL
            AND "Post"."userId" IN (6863684)
        )
    )
    AND "Post"."private" IS NULL
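So the index scan cannot use that condition to narrow the search. It walks the index in sort order and checks the rows one by one, filtering out those that do not match: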
Filter: (("featurePostOnDate" > '2020-06-25 19:28:07.816+00'::timestamp with time zone) OR (("featurePostOnDate" IS NULL) AND ("userId" = 6863684)))
That is probably why your query is slow.
You probably need composite indexes on (featurePostOnDate, userId, private) and (featurePostOnDate, private).
Hope that helps.
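The two composite indexes suggested above could be created roughly as follows. This is only a sketch: the index names are hypothetical, the column lists are taken as written in the suggestion, and whether the planner actually picks them for this query should be checked with EXPLAIN.

```sql
-- Hypothetical index names; the column lists follow the suggestion above.
CREATE INDEX "posts_feature_user_private_idx"
    ON "public"."Posts" ("featurePostOnDate", "userId", "private");

CREATE INDEX "posts_feature_private_idx"
    ON "public"."Posts" ("featurePostOnDate", "private");
```

On a 15 million row table, building each index takes a while and blocks writes for the duration unless CREATE INDEX CONCURRENTLY is used.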
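You need to write this as two separate queries, one for each branch of the OR. Apply the LIMIT to each query, then combine them with a UNION and apply the LIMIT again to the combined result. However, if the first branch finds ten rows, the second branch does not need to run at all, because all rows with non-NULL dates already sort first.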
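A minimal sketch of that rewrite, assuming the table, columns, and literal values from the question. Each branch gets its own ORDER BY and LIMIT, and the outer query sorts and limits the combined rows again. UNION ALL is used here on the assumption that duplicates cannot occur, since one branch requires a non-NULL "featurePostOnDate" and the other requires NULL.

```sql
SELECT *
FROM (
    (
        -- Branch 1: featured posts; matches the sort order of the existing partial index.
        SELECT *
        FROM "Posts" AS "Post"
        WHERE "Post"."featurePostOnDate" > '2020-06-25 19:28:07.816 +00:00'
          AND "Post"."private" IS NULL
        ORDER BY "Post"."featurePostOnDate" DESC NULLS LAST,
                 "Post"."createdAt" DESC NULLS LAST
        LIMIT 10
    )
    UNION ALL
    (
        -- Branch 2: non-featured posts of one user; every date is NULL here,
        -- so only "createdAt" matters for the ordering.
        SELECT *
        FROM "Posts" AS "Post"
        WHERE "Post"."featurePostOnDate" IS NULL
          AND "Post"."userId" IN (6863684)
          AND "Post"."private" IS NULL
        ORDER BY "Post"."createdAt" DESC NULLS LAST
        LIMIT 10
    )
) AS "Post"
-- Re-sort the (at most 20) combined rows and keep the top 10.
ORDER BY "Post"."featurePostOnDate" DESC NULLS LAST,
         "Post"."createdAt" DESC NULLS LAST
LIMIT 10;
```

This sketch always runs both branches. To skip the second branch when the first already returns ten rows, the two SELECTs would have to be issued separately from application code. The second branch may also need an index of its own (for example one that starts with "userId") to be fast; that is an assumption, not something the plan in the question shows.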