使用索引查询速度很慢——如何理解执行计划？

Question

我需要帮助来理解为什么当我使用索引时我的查询比没有任何索引时要慢。我运行解释分析命令，下面是执行计划选项 1 - 带索引，选项 2 - 不带索引。

有人可以向我解释为什么索引会使这些执行计划中的性能变差吗？

PS。当我向 table（原始大小 2M）添加 1000 万行时，情况正在转向有利于索引，在这种情况下，使用索引的查询速度提高了 3 倍）。

选项 1 WITH INDEX FOR LEFT JOIN invoice_id+acct_level ON TABLE cost_invoice_facepage AND CONDITION (cdb.invoice_id = invoice_id) AND (acct_level = 1)

Append  (cost=48.87..38583.97 rows=163773 width=371) (actual time=1.269..1516.564 rows=379129 loops=1)
  ->  Nested Loop  (cost=48.87..10520.11 rows=36504 width=362) (actual time=1.268..5.986 rows=579 loops=1)
        ->  Hash Left Join  (cost=44.66..9918.22 rows=507 width=322) (actual time=1.160..5.497 rows=579 loops=1)
              Hash Cond: (cd.gl_string_id = gs.id)
              ->  Nested Loop Left Join  (cost=0.85..9873.07 rows=507 width=262) (actual time=0.485..4.473 rows=579 loops=1)
                    Filter: ((c.gl_rule_type IS NULL) OR ((cd.charge_id IS NOT NULL) AND (c.gl_rule_type_id <> ALL ('{60,70}'::integer[]))))
                    ->  Index Scan using cost_invoice_charge_invoice_id_idx on cost_invoice_charge c  (cost=0.43..1204.53 rows=1188 width=243) (actual time=0.467..2.664 rows=579 loops=1)
                          Index Cond: (invoice_id = 14517)
                          Filter: ((chg_amt <> '0'::numeric) AND ((gl_rule_type IS NULL) OR (gl_rule_type_id <> ALL ('{60,70}'::integer[]))))
                          Rows Removed by Filter: 3364
                    ->  Index Scan using "gl_charge_detail.charge_id->cost_invoice_info_only.id" on gl_charge_detail cd  (cost=0.42..7.28 rows=1 width=27) (actual time=0.002..0.002 rows=1 loops=579)
                          Index Cond: (c.id = charge_id)
              ->  Hash  (cost=31.69..31.69 rows=969 width=64) (actual time=0.657..0.657 rows=969 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 103kB
                    ->  Seq Scan on gl_strings gs  (cost=0.00..31.69 rows=969 width=64) (actual time=0.026..0.389 rows=969 loops=1)
        ->  Materialize  (cost=4.22..145.78 rows=72 width=44) (actual time=0.000..0.000 rows=1 loops=579)
              ->  Hash Left Join  (cost=4.22..145.42 rows=72 width=44) (actual time=0.100..0.102 rows=1 loops=1)
                    Hash Cond: (f.vendor_id = vn.id)
                    ->  Nested Loop  (cost=0.57..141.57 rows=72 width=31) (actual time=0.027..0.029 rows=1 loops=1)
                          ->  Index Scan using cost_invoice_header_id_idx on cost_invoice_header ch  (cost=0.29..8.31 rows=1 width=4) (actual time=0.012..0.013 rows=1 loops=1)
                                Index Cond: (id = 14517)
                                Filter: (status_code <> ALL ('{100,101,102,490}'::integer[]))
                          ->  Index Scan using "invoice_id+acct_level" on cost_invoice_facepage f  (cost=0.29..132.55 rows=72 width=31) (actual time=0.013..0.013 rows=1 loops=1)
                                Index Cond: ((invoice_id = 14517) AND (acct_level = 1))
                    ->  Hash  (cost=2.73..2.73 rows=73 width=17) (actual time=0.061..0.061 rows=73 loops=1)
                          Buckets: 1024  Batches: 1  Memory Usage: 12kB
                          ->  Seq Scan on appdata_vendor_common vn  (cost=0.00..2.73 rows=73 width=17) (actual time=0.020..0.038 rows=73 loops=1)
  ->  Hash Left Join  (cost=2276.48..25607.26 rows=127269 width=374) (actual time=204.117..1486.717 rows=378550 loops=1)
        Hash Cond: (f_1.vendor_id = vn_1.id)
        ->  Nested Loop Left Join  (cost=2272.84..25250.14 rows=127269 width=361) (actual time=204.072..1328.491 rows=378550 loops=1)
              ->  Gather  (cost=2272.55..24385.32 rows=2663 width=338) (actual time=204.055..335.965 rows=378550 loops=1)
                    Workers Planned: 2
                    Workers Launched: 2
                    ->  Hash Left Join  (cost=1272.55..23119.02 rows=1110 width=338) (actual time=127.365..321.126 rows=126183 loops=3)
                          Hash Cond: (cdb.gl_string_id = gs_1.id)
                          ->  Hash Join  (cost=1228.74..23072.30 rows=1110 width=278) (actual time=126.126..263.315 rows=126183 loops=3)
                                Hash Cond: (cdb.charge_id = c_1.id)
                                ->  Parallel Seq Scan on gl_charge_detail_ban cdb  (cost=0.00..20581.15 rows=480915 width=43) (actual time=0.270..109.543 rows=384732 loops=3)
                                ->  Hash  (cost=1194.13..1194.13 rows=2769 width=239) (actual time=7.232..7.232 rows=3929 loops=3)
                                      Buckets: 4096  Batches: 1  Memory Usage: 635kB
                                      ->  Index Scan using cost_invoice_charge_invoice_id_idx on cost_invoice_charge c_1  (cost=0.43..1194.13 rows=2769 width=239) (actual time=0.070..4.686 rows=3929 loops=3)
                                            Index Cond: (invoice_id = 14517)
                                            Filter: (chg_amt <> '0'::numeric)
                                            Rows Removed by Filter: 14
                          ->  Hash  (cost=31.69..31.69 rows=969 width=64) (actual time=1.127..1.127 rows=969 loops=3)
                                Buckets: 1024  Batches: 1  Memory Usage: 103kB
                                ->  Seq Scan on gl_strings gs_1  (cost=0.00..31.69 rows=969 width=64) (actual time=0.165..0.714 rows=969 loops=3)
              ->  Index Scan using "invoice_id+acct_level" on cost_invoice_facepage f_1  (cost=0.29..0.31 rows=1 width=31) (actual time=0.001..0.002 rows=1 loops=378550)
                    Index Cond: ((cdb.invoice_id = invoice_id) AND (acct_level = 1))
        ->  Hash  (cost=2.73..2.73 rows=73 width=17) (actual time=0.035..0.035 rows=73 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 12kB
              ->  Seq Scan on appdata_vendor_common vn_1  (cost=0.00..2.73 rows=73 width=17) (actual time=0.014..0.021 rows=73 loops=1)
Planning Time: 3.636 ms
Execution Time: 1550.844 ms

和没有索引的选项 2

Append  (cost=48.58..43257.20 rows=163773 width=371) (actual time=7.965..831.408 rows=379129 loops=1)
  ->  Nested Loop  (cost=48.58..12251.68 rows=36504 width=362) (actual time=7.965..14.476 rows=579 loops=1)
        ->  Hash Left Join  (cost=44.66..9918.22 rows=507 width=322) (actual time=0.588..6.245 rows=579 loops=1)
              Hash Cond: (cd.gl_string_id = gs.id)
              ->  Nested Loop Left Join  (cost=0.85..9873.07 rows=507 width=262) (actual time=0.245..5.442 rows=579 loops=1)
                    Filter: ((c.gl_rule_type IS NULL) OR ((cd.charge_id IS NOT NULL) AND (c.gl_rule_type_id <> ALL ('{60,70}'::integer[]))))
                    ->  Index Scan using cost_invoice_charge_invoice_id_idx on cost_invoice_charge c  (cost=0.43..1204.53 rows=1188 width=243) (actual time=0.231..3.003 rows=579 loops=1)
                          Index Cond: (invoice_id = 14517)
                          Filter: ((chg_amt <> '0'::numeric) AND ((gl_rule_type IS NULL) OR (gl_rule_type_id <> ALL ('{60,70}'::integer[]))))
                          Rows Removed by Filter: 3364
                    ->  Index Scan using "gl_charge_detail.charge_id->cost_invoice_info_only.id" on gl_charge_detail cd  (cost=0.42..7.28 rows=1 width=27) (actual time=0.003..0.003 rows=1 loops=579)
                          Index Cond: (c.id = charge_id)
              ->  Hash  (cost=31.69..31.69 rows=969 width=64) (actual time=0.331..0.331 rows=969 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 103kB
                    ->  Seq Scan on gl_strings gs  (cost=0.00..31.69 rows=969 width=64) (actual time=0.017..0.183 rows=969 loops=1)
        ->  Materialize  (cost=3.93..1877.35 rows=72 width=44) (actual time=0.013..0.013 rows=1 loops=579)
              ->  Hash Left Join  (cost=3.93..1876.99 rows=72 width=44) (actual time=7.370..7.698 rows=1 loops=1)
                    Hash Cond: (f.vendor_id = vn.id)
                    ->  Nested Loop  (cost=0.29..1873.14 rows=72 width=31) (actual time=7.307..7.635 rows=1 loops=1)
                          ->  Index Scan using cost_invoice_header_id_idx on cost_invoice_header ch  (cost=0.29..8.31 rows=1 width=4) (actual time=0.011..0.013 rows=1 loops=1)
                                Index Cond: (id = 14517)
                                Filter: (status_code <> ALL ('{100,101,102,490}'::integer[]))
                          ->  Seq Scan on cost_invoice_facepage f  (cost=0.00..1864.12 rows=72 width=31) (actual time=7.293..7.619 rows=1 loops=1)
                                Filter: ((invoice_id = 14517) AND (acct_level = 1))
                                Rows Removed by Filter: 40340
                    ->  Hash  (cost=2.73..2.73 rows=73 width=17) (actual time=0.045..0.045 rows=73 loops=1)
                          Buckets: 1024  Batches: 1  Memory Usage: 12kB
                          ->  Seq Scan on appdata_vendor_common vn  (cost=0.00..2.73 rows=73 width=17) (actual time=0.022..0.028 rows=73 loops=1)
  ->  Hash Left Join  (cost=4248.29..28548.92 rows=127269 width=374) (actual time=234.692..789.334 rows=378550 loops=1)
        Hash Cond: (cdb.invoice_id = f_1.invoice_id)
        ->  Gather  (cost=2272.55..24385.32 rows=2663 width=338) (actual time=216.507..376.349 rows=378550 loops=1)
              Workers Planned: 2
              Workers Launched: 2
              ->  Hash Left Join  (cost=1272.55..23119.02 rows=1110 width=338) (actual time=128.932..389.669 rows=126183 loops=3)
                    Hash Cond: (cdb.gl_string_id = gs_1.id)
                    ->  Hash Join  (cost=1228.74..23072.30 rows=1110 width=278) (actual time=127.984..308.092 rows=126183 loops=3)
                          Hash Cond: (cdb.charge_id = c_1.id)
                          ->  Parallel Seq Scan on gl_charge_detail_ban cdb  (cost=0.00..20581.15 rows=480915 width=43) (actual time=0.163..117.001 rows=384732 loops=3)
                          ->  Hash  (cost=1194.13..1194.13 rows=2769 width=239) (actual time=8.779..8.779 rows=3929 loops=3)
                                Buckets: 4096  Batches: 1  Memory Usage: 635kB
                                ->  Index Scan using cost_invoice_charge_invoice_id_idx on cost_invoice_charge c_1  (cost=0.43..1194.13 rows=2769 width=239) (actual time=0.050..5.563 rows=3929 loops=3)
                                      Index Cond: (invoice_id = 14517)
                                      Filter: (chg_amt <> '0'::numeric)
                                      Rows Removed by Filter: 14
                    ->  Hash  (cost=31.69..31.69 rows=969 width=64) (actual time=0.829..0.829 rows=969 loops=3)
                          Buckets: 1024  Batches: 1  Memory Usage: 103kB
                          ->  Seq Scan on gl_strings gs_1  (cost=0.00..31.69 rows=969 width=64) (actual time=0.184..0.534 rows=969 loops=3)
        ->  Hash  (cost=1804.87..1804.87 rows=13670 width=44) (actual time=18.101..18.101 rows=13705 loops=1)
              Buckets: 16384  Batches: 1  Memory Usage: 1198kB
              ->  Hash Left Join  (cost=3.64..1804.87 rows=13670 width=44) (actual time=0.075..14.009 rows=13705 loops=1)
                    Hash Cond: (f_1.vendor_id = vn_1.id)
                    ->  Seq Scan on cost_invoice_facepage f_1  (cost=0.00..1763.26 rows=13670 width=31) (actual time=0.017..6.216 rows=13705 loops=1)
                          Filter: (acct_level = 1)
                          Rows Removed by Filter: 26636
                    ->  Hash  (cost=2.73..2.73 rows=73 width=17) (actual time=0.052..0.052 rows=73 loops=1)
                          Buckets: 1024  Batches: 1  Memory Usage: 12kB
                          ->  Seq Scan on appdata_vendor_common vn_1  (cost=0.00..2.73 rows=73 width=17) (actual time=0.013..0.027 rows=73 loops=1)
Planning Time: 3.365 ms
Execution Time: 863.941 ms

Answer 1

查看在索引扫描中驱动迭代的行：

Gather  (cost=2272.55..24385.32 rows=2663 width=338) (actual time=204.055..335.965 rows=378550 loops=1)

它认为索引扫描将迭代 2663 次（每个 invoice_id 的值不同）但它实际上迭代了 378550 次，（后一个数字是 'loops'索引扫描上的字段来自），相差 140 倍。每次访问索引时，都需要从根到叶 re-descend，边走边锁定和解锁页面。虽然这不是非常昂贵，但如果您执行 378550 次，它确实会加起来。将 table 批量处理成私有散列 table 会变得更快。但是由于估计的行数非常错误，PostgreSQL 在这种情况下没有意识到这一点。

使用索引查询速度很慢——如何理解执行计划？

Query is slow with indexes - how to understand execution plans?

postgresql

query-performance

database-indexes