PostgreSQL CURSOR with "order by" clause
Let's assume there is a query named A that takes 2 seconds.
SELECT ... FROM ... ORDER BY "users_device"."id" -- Query A
-- It contains JOIN clauses.
-- It takes 2 seconds.
However, when I run A through a declared CURSOR, it takes 8 seconds.
DECLARE "cursor" NO SCROLL CURSOR WITH HOLD FOR SELECT ... FROM ... ORDER BY "users_device"."id"
-- It takes 8 seconds.
I compared the two query plans and found that A with CURSOR seems to try to avoid the sort operation.
Below are the actual query plans.
# A without CURSOR
+--------------------------------------------------------------------------------------------------------------------------------------------------------------
| QUERY PLAN
|--------------------------------------------------------------------------------------------------------------------------------------------------------------
| Sort (cost=433944.70..434050.91 rows=42485 width=147) (actual time=2664.192..2669.168 rows=35949 loops=1)
| Sort Key: users_device.id DESC
| Sort Method: external merge Disk: 5536kB
| -> Nested Loop (cost=239036.59..427483.24 rows=42485 width=147) (actual time=1956.219..2631.077 rows=35949 loops=1)
| -> Nested Loop (cost=239036.16..404723.17 rows=43069 width=151) (actual time=1956.209..2502.529 rows=39556 loops=1)
| -> Hash Join (cost=239035.73..367340.95 rows=51402 width=12) (actual time=1956.192..2249.085 rows=63844 loops=1)
| Hash Cond: (users_serviceuser_favorites.from_serviceuser_id = users_serviceuser.id)
| -> Bitmap Heap Scan on users_serviceuser_favorites (cost=1988.30..119110.71 rows=72756 width=4) (actual time=22.182..74.569 rows=66736 lo
| Recheck Cond: (to_serviceuser_id = 773433)
| Heap Blocks: exact=43597
| -> Bitmap Index Scan on users_serviceuser_favorites_011e5c87 (cost=0.00..1970.11 rows=72756 width=0) (actual time=13.108..13.108 ro
| Index Cond: (to_serviceuser_id = 773433)
| -> Hash (cost=196162.09..196162.09 rows=2492028 width=8) (actual time=1932.025..1932.025 rows=2503575 loops=1)
| Buckets: 131072 Batches: 64 Memory Usage: 2564kB
| -> Seq Scan on users_serviceuser (cost=0.00..196162.09 rows=2492028 width=8) (actual time=0.184..1517.721 rows=2503575 loops=1)
| Filter: (status = 1)
| Rows Removed by Filter: 1016611
| -> Index Scan using users_device_e8701ad4 on users_device (cost=0.43..0.72 rows=1 width=151) (actual time=0.003..0.004 rows=1 loops=63844)
| Index Cond: (user_id = users_serviceuser.id)
| Filter: (status = 0)
| Rows Removed by Filter: 1
| -> Index Scan using users_pushsetting_pkey on users_pushsetting (cost=0.43..0.52 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=39556)
| Index Cond: (id = users_serviceuser.push_settings_id)
| Filter: live
| Rows Removed by Filter: 0
| Planning time: 2.537 ms
| Execution time: 2671.895 ms
+--------------------------------------------------------------------------------------------------------------------------------------------------------------
# A with CURSOR (you can see there is no separate sort operation)
+--------------------------------------------------------------------------------------------------------------------------------------------------------------
| QUERY PLAN
|--------------------------------------------------------------------------------------------------------------------------------------------------------------
| Nested Loop (cost=1.85..3204962.54 rows=42484 width=147) (actual time=0.324..8345.880 rows=35945 loops=1)
| -> Nested Loop (cost=1.42..3182203.00 rows=43068 width=151) (actual time=0.314..8206.619 rows=39552 loops=1)
| -> Nested Loop (cost=0.99..3124567.60 rows=84403 width=155) (actual time=0.302..8035.916 rows=43584 loops=1)
| -> Index Scan Backward using users_device_pkey on users_device (cost=0.43..423536.24 rows=2955421 width=151) (actual time=0.017..2659.130 row
| Filter: (status = 0)
| Rows Removed by Filter: 692112
| -> Index Only Scan using users_serviceuser_favorites_from_serviceuser_id_ac0a7b1d_uniq on users_serviceuser_favorites (cost=0.56..0.89 rows=2 width
| Index Cond: ((from_serviceuser_id = users_device.user_id) AND (to_serviceuser_id = 773433))
| Heap Fetches: 38956
| -> Index Scan using users_serviceuser_pkey on users_serviceuser (cost=0.43..0.67 rows=1 width=8) (actual time=0.003..0.004 rows=1 loops=43584)
| Index Cond: (id = users_device.user_id)
| Filter: (status = 1)
| Rows Removed by Filter: 0
| -> Index Scan using users_pushsetting_pkey on users_pushsetting (cost=0.43..0.52 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=39552)
| Index Cond: (id = users_serviceuser.push_settings_id)
| Filter: live
| Rows Removed by Filter: 0
| Planning time: 4.056 ms
| Execution time: 8348.095 ms
+--------------------------------------------------------------------------------------------------------------------------------------------------------------
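When the CURSOR is declared, the optimizer starts with an index scan on "users_device"."id" (users_device_pkey), so that all the joined rows come out ordered by "users_device"."id" without any additional sort operation. As a result, it does not need to perform order by "users_device"."id" afterwards, even though this choice makes the query slower.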
Why does the optimizer choose a different plan for the CURSOR?
Is it to avoid the sort?
If so, why?
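I couldn't find this in the documentation, but my guess is that when you use a cursor, the database gives more weight to the estimated startup cost (the time to output the first row) than to the estimated total cost (the time to output the last row).
In your example, the slow plan is estimated to output its first row after 1.85 cost units, while the fast plan needs 433944.70 cost units before it outputs anything. So when you use a cursor, the database seems to prefer the slow plan so that it can deliver partial results as soon as possible.
That seems reasonable: you use a cursor instead of a plain query presumably because you want to start processing data as soon as possible.
I think you can work around this by explicitly creating a temporary table: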
create temporary table t as select ... /* skip order by */;
declare c cursor with hold for select * from t order by id;
As @mastaBlasta pointed out in a comment, there is an option that controls how strongly the first rows are preferred over the whole result for cursors: cursor_tuple_fraction.
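To see how this setting could explain the two plans, here is a rough sketch that plugs the costs from the EXPLAIN output above into a simple linear model. It assumes the planner ranks cursor plans by startup cost plus a fraction of the remaining cost, and that the fraction is the documented default of 0.1. The function and constant names are made up for illustration and are not PostgreSQL internals.

```python
# Costs copied from the two EXPLAIN ANALYZE outputs in the question:
# (startup cost, total cost) of the top plan node.
SORT_PLAN = (433944.70, 434050.91)    # plain query: Sort on top of the joins
INDEX_PLAN = (1.85, 3204962.54)       # cursor: backward scan on users_device_pkey


def fractional_cost(startup, total, fraction):
    """Estimated cost of fetching the first `fraction` of the rows.

    Assumption: cost grows linearly from startup cost to total cost.
    """
    return startup + fraction * (total - startup)


def preferred_plan(fraction):
    """Return which of the two plans is cheaper for the given fraction."""
    sort_cost = fractional_cost(*SORT_PLAN, fraction)
    index_cost = fractional_cost(*INDEX_PLAN, fraction)
    return "sort" if sort_cost <= index_cost else "index"


if __name__ == "__main__":
    for fraction in (0.1, 0.5, 1.0):
        print(fraction, preferred_plan(fraction))
```

Under these assumptions the index plan wins at 0.1 (about 320498 cost units against about 433955), while the sort plan wins at 0.5 and at 1.0. That matches what the question observed, and it suggests that running SET cursor_tuple_fraction = 1.0; in the session before DECLARE may bring back the faster plan when the cursor is going to be read to the end.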