EXPLAIN ANALYZE in PostgreSQL: impact on query performance?
Every time I run EXPLAIN ANALYZE on a query in PostgreSQL, the Execution time decreases. Why?
I need to add some indexes to the table, but because of this effect I can't tell whether my changes actually improve performance. What would you recommend?
Sample output from running explain (analyze, buffers) on the same query consecutively:
my_db=# explain (analyze, buffers)
SELECT COUNT(*) AS count
FROM my_view
WHERE creation_time >= '2019-01-18 00:00:00'
AND creation_time <= '2019-01-18 09:43:36'
ORDER BY count DESC
LIMIT 10000;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=2568854.23..2568854.23 rows=1 width=8) (actual time=24380.613..24380.614 rows=1 loops=1)
Buffers: shared hit=14915 read=2052993, temp read=562 written=563
-> Sort (cost=2568854.23..2568854.23 rows=1 width=8) (actual time=24380.611..24380.611 rows=1 loops=1)
Sort Key: (count(*)) DESC
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=14915 read=2052993, temp read=562 written=563
-> Aggregate (cost=2568854.21..2568854.22 rows=1 width=8) (actual time=24380.599..24380.599 rows=1 loops=1)
Buffers: shared hit=14915 read=2052993, temp read=562 written=563
-> GroupAggregate (cost=2568854.18..2568854.20 rows=1 width=28) (actual time=24339.455..24380.589 rows=40 loops=1)
Group Key: my_table.creation_time, my_table.some_field
Buffers: shared hit=14915 read=2052993, temp read=562 written=563
-> Sort (cost=2568854.18..2568854.18 rows=1 width=12) (actual time=24338.361..24357.171 rows=199309 loops=1)
Sort Key: my_table.creation_time, my_table.some_field
Sort Method: external merge Disk: 4496kB
Buffers: shared hit=14915 read=2052993, temp read=562 written=563
-> Gather (cost=1000.00..2568854.17 rows=1 width=12) (actual time=23799.237..24142.217 rows=199309 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=14915 read=2052993
-> Parallel Seq Scan on my_table (cost=0.00..2567854.07 rows=1 width=12) (actual time=23796.695..24087.574 rows=66436 loops=3)
Filter: (creation_time >= '2019-01-18 00:00:00+00'::timestamp with time zone) AND (creation_time <= '2019-01-18 09:43:36+00'::timestamp with time zone)
Rows Removed by Filter: 21818095
Buffers: shared hit=14915 read=2052993
Planning time: 10.982 ms
Execution time: 24381.544 ms
(25 rows)
my_db=# explain (analyze, buffers)
SELECT COUNT(*) AS count
FROM my_view
WHERE creation_time >= '2019-01-18 00:00:00'
AND creation_time <= '2019-01-18 09:43:36'
ORDER BY count DESC
LIMIT 10000;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=2568854.23..2568854.23 rows=1 width=8) (actual time=6836.247..6836.248 rows=1 loops=1)
Buffers: shared hit=15181 read=2052727, temp read=562 written=563
-> Sort (cost=2568854.23..2568854.23 rows=1 width=8) (actual time=6836.245..6836.246 rows=1 loops=1)
Sort Key: (count(*)) DESC
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=15181 read=2052727, temp read=562 written=563
-> Aggregate (cost=2568854.21..2568854.22 rows=1 width=8) (actual time=6836.232..6836.232 rows=1 loops=1)
Buffers: shared hit=15181 read=2052727, temp read=562 written=563
-> GroupAggregate (cost=2568854.18..2568854.20 rows=1 width=28) (actual time=6792.036..6836.221 rows=40 loops=1)
Group Key: my_table.creation_time, my_table.some_field
Buffers: shared hit=15181 read=2052727, temp read=562 written=563
-> Sort (cost=2568854.18..2568854.18 rows=1 width=12) (actual time=6790.807..6811.469 rows=199309 loops=1)
Sort Key: my_table.creation_time, my_table.some_field
Sort Method: external merge Disk: 4496kB
Buffers: shared hit=15181 read=2052727, temp read=562 written=563
-> Gather (cost=1000.00..2568854.17 rows=1 width=12) (actual time=6271.571..6604.946 rows=199309 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=15181 read=2052727
-> Parallel Seq Scan on my_table (cost=0.00..2567854.07 rows=1 width=12) (actual time=6268.383..6529.416 rows=66436 loops=3)
Filter: (creation_time >= '2019-01-18 00:00:00+00'::timestamp with time zone) AND (creation_time <= '2019-01-18 09:43:36+00'::timestamp with time zone)
Rows Removed by Filter: 21818095
Buffers: shared hit=15181 read=2052727
Planning time: 0.570 ms
Execution time: 6837.137 ms
(25 rows)
Thanks.
Which version of PostgreSQL is this?
There are two problems here:

A missing index:
CREATE INDEX ON my_table (creation_time);
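Assuming the range predicate is reasonably selective (it matches about 200,000 of roughly 65 million rows in the plans above), a B-tree index on creation_time should let the planner replace the Parallel Seq Scan with an index or bitmap scan. A sketch of how you might build and verify it (the index name is my choice, not from your schema):

```sql
-- Build the index without blocking concurrent writes; CONCURRENTLY
-- takes longer and must run outside a transaction block.
CREATE INDEX CONCURRENTLY my_table_creation_time_idx
    ON my_table (creation_time);

-- Re-check the plan: a Bitmap Heap Scan or Index Scan using the new
-- index should now appear in place of the Parallel Seq Scan.
EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*) AS count
FROM my_view
WHERE creation_time >= '2019-01-18 00:00:00'
  AND creation_time <= '2019-01-18 09:43:36';
```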
Skewed distribution statistics for creation_time, which make PostgreSQL badly underestimate the number of result rows (rows=1 estimated versus rows=199309 actual). This is not the direct cause of your problem, but you should investigate what is going on there:
If ANALYZE my_table improves the estimates, configure autoanalyze to run more often for that table.
If ANALYZE my_table does not help, collect better statistics:
ALTER TABLE my_table ALTER creation_time SET STATISTICS 1000;
ANALYZE my_table;
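To check whether the raised statistics target actually changed what the planner knows, you can inspect the pg_stats system view (a sketch; the exact values depend on your data):

```sql
-- After SET STATISTICS 1000 and a fresh ANALYZE, the histogram for
-- creation_time can hold up to 1000 buckets instead of the default 100.
SELECT n_distinct,
       array_length(histogram_bounds, 1) AS histogram_buckets
FROM pg_stats
WHERE tablename = 'my_table'
  AND attname = 'creation_time';
```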
The decreasing execution time you observe with EXPLAIN (ANALYZE) is most likely due to caching effects.
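You can see the caching at work in your own output: the buffer counts are nearly identical across runs (about 2 million pages counted as "read" both times), yet the second run is four times faster. Since PostgreSQL still reports the pages as reads rather than shared hits, the speedup almost certainly comes from the operating system's file cache, not from shared_buffers. With the optional contrib extension pg_buffercache you can also check how much of the table sits in PostgreSQL's own buffer pool (a sketch):

```sql
-- Requires superuser or pg_monitor to install and query.
CREATE EXTENSION IF NOT EXISTS pg_buffercache;

-- Count the 8kB shared-buffer pages of my_table currently cached.
SELECT count(*) AS buffers_cached,
       pg_size_pretty(count(*) * 8192) AS cached_size
FROM pg_buffercache b
JOIN pg_class c
  ON b.relfilenode = pg_relation_filenode(c.oid)
WHERE c.relname = 'my_table';
```

For reproducible cold-cache timings you would need to restart PostgreSQL and drop the OS page cache between runs, which is rarely practical on a production system.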