postgresql查询性能增强
postgresql query performance enhancement
我正在尝试获得最高人气的行。按受欢迎程度降序排序会显着减慢查询速度。
有没有更好的方法来优化这个查询?
PostgreSQL - 9.5
```explain analyse SELECT v.cosmo_id,
v.resource_id, k.gid, k.popularity,v.cropinfo_id
FROM rmsg.verifications V INNER JOIN rmip.resourceinfo R ON
(R.id=V.resource_id AND R.source_id=54) INNER JOIN rmpp.kgidinfo K ON
(K.cosmo_id=V.cosmo_id) WHERE V.status=1 AND
v.crop_Status=1 AND V.locked_time isnull ORDER BY k.popularity
desc, (v.cosmo_id,
v.resource_id, v.cropinfo_id) LIMIT 1;```
QUERY PLAN
Limit (cost=470399.99..470399.99 rows=1 width=31) (actual time=19655.552..19655.553 rows=1 loops=1)
Sort (cost=470399.99..470434.80 rows=13923 width=31) (actual time=19655.549..19655.549 rows=1 loops=1)
Sort Key: k.popularity DESC, (ROW(v.cosmo_id, v.resource_id, v.cropinfo_id))
Sort Method: top-N heapsort Memory: 25kB
-> Nested Loop (cost=19053.91..470330.37 rows=13923 width=31) (actual time=58.365..19627.405 rows=23006 loops=1)
-> Hash Join (cost=19053.48..459008.74 rows=13188 width=16) (actual time=58.275..19268.339 rows=19165 loops=1)
Hash Cond: (v.resource_id = r.id)
-> Seq Scan on verifications v (cost=0.00..409876.92 rows=7985725 width=16) (actual time=0.035..11097.163 rows=9908140 loops=1)
Filter: ((locked_time IS NULL) AND (status = 1) AND (crop_status = 1))
Rows Removed by Filter: 1126121
-> Hash (cost=18984.23..18984.23 rows=5540 width=4) (actual time=57.101..57.101 rows=5186 loops=1)
Buckets: 8192 Batches: 1 Memory Usage: 247kB
-> Bitmap Heap Scan on resourceinfo r (cost=175.37..18984.23 rows=5540 width=4) (actual time=2.827..51.318 rows=5186 loops=1)
Recheck Cond: (source_id = 54)
Heap Blocks: exact=5907
-> Bitmap Index Scan on resourceinfo_source_id_key (cost=0.00..173.98 rows=5540 width=0) (actual time=1.742..1.742 rows=6483 loops=1)
Index Cond: (source_id = 54)
Index Scan using kgidinfo_cosmo_id_idx on kgidinfo k (cost=0.43..0.85 rows=1 width=23) (actual time=0.013..0.014 rows=1 loops=19165)
Index Cond: (cosmo_id = v.cosmo_id)
Planning time: 1.083 ms
Execution time: 19655.638 ms
(21 rows)```
这是您的查询,已通过删除括号进行简化:
SELECT v.cosmo_id, v.resource_id, k.gid, k.popularity, v.cropinfo_id
FROM rmsg.verifications V INNER JOIN
rmip.resourceinfo R
ON R.id = V.resource_id AND R.source_id = 54 INNER JOIN
rmpp.kgidinfo K
ON K.cosmo_id = V.cosmo_id
WHERE V.status = 1 AND v.crop_Status = 1 AND
V.locked_time is null
ORDER BY k.popularity desc, v.cosmo_id, v.resource_id, v.cropinfo_id
LIMIT 1;
对于这个查询,我会考虑 verifications(status, crop_status, locked_time, resource_id, cosmo_id, crop_info_id)
、resourceinfo(id, source_id)
和 kgidinfo(cosmo_id)
上的索引。我没有看到删除 ORDER BY
.
的简单方法
在看查询的过程中,不知你是不是两个表之间存在笛卡尔积问题
我正在尝试获得最高人气的行。按受欢迎程度降序排序会显着减慢查询速度。 有没有更好的方法来优化这个查询?
PostgreSQL - 9.5
```explain analyse SELECT v.cosmo_id,
v.resource_id, k.gid, k.popularity,v.cropinfo_id
FROM rmsg.verifications V INNER JOIN rmip.resourceinfo R ON
(R.id=V.resource_id AND R.source_id=54) INNER JOIN rmpp.kgidinfo K ON
(K.cosmo_id=V.cosmo_id) WHERE V.status=1 AND
v.crop_Status=1 AND V.locked_time isnull ORDER BY k.popularity
desc, (v.cosmo_id,
v.resource_id, v.cropinfo_id) LIMIT 1;```
QUERY PLAN
Limit (cost=470399.99..470399.99 rows=1 width=31) (actual time=19655.552..19655.553 rows=1 loops=1)
Sort (cost=470399.99..470434.80 rows=13923 width=31) (actual time=19655.549..19655.549 rows=1 loops=1)
Sort Key: k.popularity DESC, (ROW(v.cosmo_id, v.resource_id, v.cropinfo_id))
Sort Method: top-N heapsort Memory: 25kB
-> Nested Loop (cost=19053.91..470330.37 rows=13923 width=31) (actual time=58.365..19627.405 rows=23006 loops=1)
-> Hash Join (cost=19053.48..459008.74 rows=13188 width=16) (actual time=58.275..19268.339 rows=19165 loops=1)
Hash Cond: (v.resource_id = r.id)
-> Seq Scan on verifications v (cost=0.00..409876.92 rows=7985725 width=16) (actual time=0.035..11097.163 rows=9908140 loops=1)
Filter: ((locked_time IS NULL) AND (status = 1) AND (crop_status = 1))
Rows Removed by Filter: 1126121
-> Hash (cost=18984.23..18984.23 rows=5540 width=4) (actual time=57.101..57.101 rows=5186 loops=1)
Buckets: 8192 Batches: 1 Memory Usage: 247kB
-> Bitmap Heap Scan on resourceinfo r (cost=175.37..18984.23 rows=5540 width=4) (actual time=2.827..51.318 rows=5186 loops=1)
Recheck Cond: (source_id = 54)
Heap Blocks: exact=5907
-> Bitmap Index Scan on resourceinfo_source_id_key (cost=0.00..173.98 rows=5540 width=0) (actual time=1.742..1.742 rows=6483 loops=1)
Index Cond: (source_id = 54)
Index Scan using kgidinfo_cosmo_id_idx on kgidinfo k (cost=0.43..0.85 rows=1 width=23) (actual time=0.013..0.014 rows=1 loops=19165)
Index Cond: (cosmo_id = v.cosmo_id)
Planning time: 1.083 ms
Execution time: 19655.638 ms
(21 rows)```
这是您的查询,已通过删除括号进行简化:
SELECT v.cosmo_id, v.resource_id, k.gid, k.popularity, v.cropinfo_id
FROM rmsg.verifications V INNER JOIN
rmip.resourceinfo R
ON R.id = V.resource_id AND R.source_id = 54 INNER JOIN
rmpp.kgidinfo K
ON K.cosmo_id = V.cosmo_id
WHERE V.status = 1 AND v.crop_Status = 1 AND
V.locked_time is null
ORDER BY k.popularity desc, v.cosmo_id, v.resource_id, v.cropinfo_id
LIMIT 1;
对于这个查询,我会考虑 verifications(status, crop_status, locked_time, resource_id, cosmo_id, crop_info_id)
、resourceinfo(id, source_id)
和 kgidinfo(cosmo_id)
上的索引。我没有看到删除 ORDER BY
.
在看查询的过程中,不知你是不是两个表之间存在笛卡尔积问题