Why is a union of two queries faster than a single one of the unioned queries?

Question

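I'm debugging query times on two Google Cloud SQL Postgres 9.6 instances with autovacuum enabled: staging (no traffic) with 7.5 GB and 2 vCPUs, and production with 37.5 GB and 10 vCPUs. The results are the same on both, and they are confusing.

Indexes: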

trade_user1
trade_user2

Consistently takes 100-120 ms:

SELECT * FROM "Trade" WHERE "user1" = 1
UNION
SELECT * FROM "Trade" WHERE "user2" = 1
LIMIT 24;

Limit  (cost=221.92..222.16 rows=24 width=1187) (actual time=0.115..0.124 rows=24 loops=1)
  ->  HashAggregate  (cost=221.92..222.46 rows=54 width=1187) (actual time=0.115..0.121 rows=24 loops=1)
        Group Key: id, status, user1, user2
        ->  Append  (cost=4.60..218.55 rows=54 width=1187) (actual time=0.024..0.076 rows=26 loops=1)
              ->  Bitmap Heap Scan on "Trade"  (cost=4.60..89.99 rows=22 width=155) (actual time=0.024..0.061 rows=23 loops=1)
                    Recheck Cond: (user1 = 1)
                    Heap Blocks: exact=20
                    ->  Bitmap Index Scan on trade_depositor_user_id  (cost=0.00..4.59 rows=22 width=0) (actual time=0.016..0.016 rows=23 loops=1)
                          Index Cond: (user1 = 1)
              ->  Bitmap Heap Scan on "Trade" "Trade_1"  (cost=4.67..128.02 rows=32 width=155) (actual time=0.011..0.014 rows=3 loops=1)
                    Recheck Cond: (user2 = 1)
                    Heap Blocks: exact=3
                    ->  Bitmap Index Scan on trade_withdrawer_user_id  (cost=0.00..4.67 rows=32 width=0) (actual time=0.009..0.009 rows=3 loops=1)
                          Index Cond: (user2 = 1)
Planning time: 0.224 ms
Execution time: 0.189 ms

Consistently takes 280-350 ms:

SELECT * FROM "Trade" WHERE "user1" = 1

Bitmap Heap Scan on "Trade"  (cost=4.60..89.99 rows=22 width=155) (actual time=0.023..0.054 rows=23 loops=1)
  Recheck Cond: (user1 = 1)
  Heap Blocks: exact=20
  ->  Bitmap Index Scan on trade_user1  (cost=0.00..4.59 rows=22 width=0) (actual time=0.015..0.015 rows=23 loops=1)
        Index Cond: (user1 = 1)
Planning time: 0.077 ms
Execution time: 0.078 ms

Both queries return result sets of equal size. I have tried different variants of the simpler query, e.g. ordering by ID ASC/DESC.

Answer 1

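To make a fair comparison, you should use UNION ALL, which simply concatenates the results, rather than UNION, which also deduplicates them (typically by sorting or, as in the plan in the question, by hashing).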

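Perhaps the optimizer got confused and chose different heuristics because of the deduplication step, and the small row counts produced an edge case that isn't meaningful.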
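To illustrate the difference between UNION and UNION ALL described in this answer, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for Postgres, and the four sample rows are hypothetical; only the table and column names come from the question. A row where both user1 and user2 equal 1 matches both branches, so UNION returns it once while UNION ALL returns it twice.

```python
import sqlite3

# Assumption: SQLite stands in for Postgres here; only the set semantics of
# UNION vs UNION ALL are being illustrated, not planner behaviour or timing.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "Trade" ('
    "id INTEGER PRIMARY KEY, status TEXT, user1 INTEGER, user2 INTEGER)"
)

# Hypothetical sample rows.
conn.executemany(
    'INSERT INTO "Trade" (id, status, user1, user2) VALUES (?, ?, ?, ?)',
    [
        (1, "open", 1, 2),    # matches the user1 branch only
        (2, "open", 3, 1),    # matches the user2 branch only
        (3, "closed", 1, 1),  # matches both branches
        (4, "open", 4, 5),    # matches neither branch
    ],
)

# UNION removes the duplicate copy of row 3.
union_rows = conn.execute(
    'SELECT * FROM "Trade" WHERE "user1" = 1 '
    "UNION "
    'SELECT * FROM "Trade" WHERE "user2" = 1'
).fetchall()

# UNION ALL just concatenates both branches, so row 3 appears twice.
union_all_rows = conn.execute(
    'SELECT * FROM "Trade" WHERE "user1" = 1 '
    "UNION ALL "
    'SELECT * FROM "Trade" WHERE "user2" = 1'
).fetchall()

print(len(union_rows))      # 3
print(len(union_all_rows))  # 4
conn.close()
```

If duplicates across the two branches matter to the application, UNION ALL is only a drop-in replacement for benchmarking, not necessarily for the real query.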

Answer 2

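It turns out the timings I was using came from PgAdmin. After connecting over SSH to the actual database network/server, I can see that the difference between the two variants is negligible and in fact matches what appears in EXPLAIN ANALYSE.

So in fact, running the UNION is not faster than the single query.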
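The gap between a client-reported time and the execution time reported by the server can be sketched as follows. This is only an illustration under stated assumptions: SQLite stands in for Postgres, the helper names are hypothetical, and `SIMULATED_CLIENT_OVERHEAD_S` is an invented constant standing in for network round trips and GUI rendering, not a measurement of PgAdmin.

```python
import sqlite3
import time

# Hypothetical stand-in for network latency plus client/GUI overhead.
SIMULATED_CLIENT_OVERHEAD_S = 0.05


def run_query_server_side(conn, sql):
    """Time only the query itself, roughly what an execution-time figure covers."""
    start = time.perf_counter()
    rows = conn.execute(sql).fetchall()
    elapsed_ms = (time.perf_counter() - start) * 1000
    return rows, elapsed_ms


def run_query_via_client(conn, sql):
    """Time the whole round trip as a remote client would observe it."""
    start = time.perf_counter()
    time.sleep(SIMULATED_CLIENT_OVERHEAD_S)  # simulated transport/rendering cost
    rows, server_ms = run_query_server_side(conn, sql)
    client_ms = (time.perf_counter() - start) * 1000
    return rows, server_ms, client_ms


conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "Trade" (id INTEGER PRIMARY KEY, user1 INTEGER, user2 INTEGER)')
conn.executemany(
    'INSERT INTO "Trade" (user1, user2) VALUES (?, ?)',
    [(i % 50, (i * 7) % 50) for i in range(1000)],  # hypothetical data
)

rows, server_ms, client_ms = run_query_via_client(
    conn, 'SELECT * FROM "Trade" WHERE "user1" = 1'
)
print(f"server-side: {server_ms:.3f} ms, client-observed: {client_ms:.3f} ms")
conn.close()
```

The client-observed figure is dominated by the simulated overhead rather than by the query, which is why comparing two queries through a remote GUI can be misleading when the queries themselves take well under a millisecond.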

postgresql

google-cloud-sql

postgresql-9.6