Select distinct is very slow
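I have a table that stores rows with an external ID. I often need to select the latest timestamp for a given set of external IDs, and this query is now the bottleneck of my application.
Query: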
SELECT DISTINCT ON ("T1"."external_id") "T1"."external_id", "T1"."timestamp"
FROM "T1"
WHERE "T1"."external_id" IN ('825889935', '825904511')
ORDER BY "T1"."external_id" ASC, "T1"."timestamp" DESC
Explain:
Unique (cost=169123.13..169123.19 rows=12 width=18) (actual time=1327.443..1334.118 rows=2 loops=1)
  -> Sort (cost=169123.13..169123.16 rows=12 width=18) (actual time=1327.441..1334.112 rows=2 loops=1)
        Sort Key: external_id, timestamp DESC
        Sort Method: quicksort Memory: 25kB
        -> Gather (cost=1000.00..169122.91 rows=12 width=18) (actual time=752.577..1334.056 rows=2 loops=1)
              Workers Planned: 2
              Workers Launched: 2
              -> Parallel Seq Scan on T1 (cost=0.00..168121.71 rows=5 width=18) (actual time=921.649..1300.556 rows=1 loops=3)
                    Filter: ((external_id)::text = ANY ('{825889935,825904511}'::text[]))
                    Rows Removed by Filter: 1168882
Planning Time: 0.592 ms
Execution Time: 1334.159 ms
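What can I do to make this query faster? Or should I use a completely different query?
UPDATE:
Added the new query plan, as requested by @jahrl. The new query looks faster, but the previous plan was captured under load; now both queries take about the same time.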
Finalize GroupAggregate (cost=169121.80..169123.21 rows=12 width=18) (actual time=321.009..322.410 rows=2 loops=1)
  Group Key: external_id
  -> Gather Merge (cost=169121.80..169123.04 rows=10 width=18) (actual time=321.003..322.403 rows=2 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        -> Partial GroupAggregate (cost=168121.77..168121.86 rows=5 width=18) (actual time=318.671..318.672 rows=1 loops=3)
              Group Key: external_id
              -> Sort (cost=168121.77..168121.78 rows=5 width=18) (actual time=318.664..318.665 rows=1 loops=3)
                    Sort Key: external_id
                    Sort Method: quicksort Memory: 25kB
                    Worker 0: Sort Method: quicksort Memory: 25kB
                    Worker 1: Sort Method: quicksort Memory: 25kB
                    -> Parallel Seq Scan on T1 (cost=0.00..168121.71 rows=5 width=18) (actual time=144.338..318.611 rows=1 loops=3)
                          Filter: ((external_id)::text = ANY ('{825889935,825904511}'::text[]))
                          Rows Removed by Filter: 1170827
Planning Time: 0.093 ms
Execution Time: 322.441 ms
Maybe a basic GROUP BY query would perform better?
SELECT "T1"."external_id", MAX("T1"."timestamp") as "timestamp"
FROM "T1"
WHERE "T1"."external_id" IN ('825889935', '825904511')
GROUP BY "T1"."external_id"
ORDER BY "T1"."external_id" ASC
And, as @melcher said, don't forget an index on ("external_id", "timestamp")!
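A minimal sketch of such a composite index, assuming the table and column names from the question. The index name is hypothetical, and CONCURRENTLY is an optional choice for a table that is under load:

```sql
-- Hypothetical index name; table and column names are taken from the question.
-- The column order (external_id first, then timestamp DESC) matches the
-- ORDER BY of the DISTINCT ON query, and also lets MAX("timestamp") be
-- answered per external_id from the index.
-- CONCURRENTLY avoids blocking writes while the index is built, but it
-- cannot be run inside a transaction block.
CREATE INDEX CONCURRENTLY t1_external_id_timestamp_idx
    ON "T1" ("external_id", "timestamp" DESC);
```

With an index like this the planner can likely replace the Parallel Seq Scan with an index scan for both the DISTINCT ON and the GROUP BY form, though the plan it actually picks depends on table statistics.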
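Look at the number of rows removed by the filter: each of the three scan processes discards more than a million rows to return just two. Create an index on external_id.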
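As a rough sketch of that suggestion, assuming the table from the question (the index name is hypothetical), one could create the index and then re-run the plan to check that the sequential scan is gone:

```sql
-- Hypothetical index name; a single-column index on external_id is enough
-- to let the planner avoid scanning the whole table.
CREATE INDEX t1_external_id_idx ON "T1" ("external_id");

-- Refresh planner statistics so the new index is costed sensibly.
ANALYZE "T1";

-- Re-check the plan: look for an index or bitmap scan on T1 in place of
-- "Parallel Seq Scan", and a much smaller "Rows Removed by Filter".
EXPLAIN (ANALYZE, BUFFERS)
SELECT "T1"."external_id", MAX("T1"."timestamp") AS "timestamp"
FROM "T1"
WHERE "T1"."external_id" IN ('825889935', '825904511')
GROUP BY "T1"."external_id";
```

Comparing the new Execution Time against the 322 ms and 1334 ms figures above is only meaningful under similar load, since the earlier plans were captured under different conditions.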