PostgreSQL 查询计划中缺少 table 访问权限
Missing table access in PostgreSQL query plan
我有两个相同的表,一个有 1k 行,另一个有 1M 行。我使用以下脚本来填充它们。
CREATE TABLE Table1 (
id int NOT NULL primary key,
groupby int NOT NULL,
orderby int NOT NULL,
local_search int NOT NULL,
global_search int NOT NULL,
padding varchar(100) NOT NULL
);
CREATE TABLE Table2 (
id int NOT NULL primary key,
groupby int NOT NULL,
orderby int NOT NULL,
local_search int NOT NULL,
global_search int NOT NULL,
padding varchar(100) NOT NULL
);
INSERT
INTO Table1
WITH t1 AS
(
SELECT id
FROM generate_series(1, 10000) id
), t2 AS
(
SELECT id,
id % 100 groupby
FROM t1
), t3 AS
(
SELECT b.id, b.groupby, row_number() over (partition by groupby order by id) orderby
FROM t2 b
)
SELECT id,
groupby,
orderby,
orderby % 50 local_search,
id % 1000 global_search,
RPAD('Value ' || id || ' ' , 100, '*') as padding
FROM t3;
INSERT
INTO Table2
WITH t1 AS
(
SELECT id
FROM generate_series(1, 1000000) id
), t2 AS
(
SELECT id,
id % 100 groupby
FROM t1
), t3 AS
(
SELECT b.id, b.groupby, row_number() over (partition by groupby order by id) orderby
FROM t2 b
)
SELECT id,
groupby,
orderby,
orderby % 50 local_search,
id % 1000 global_search,
RPAD('Value ' || id || ' ' , 100, '*') as padding
FROM t3;
我还在表 2 上创建了二级索引
CREATE INDEX ix_Table2_groupby_orderby ON Table2 (groupby, orderby);
现在,我有以下查询
select b.id, b.groupby, b.orderby, b.local_search, b.global_search, b.padding
from Table2 b
join Table1 a on b.orderby = a.id
where a.global_search = 1 and b.groupby < 10;
使用 explain(analyze)
导致以下查询计划
"Nested Loop (cost=0.42..17787.05 rows=100 width=121) (actual time=0.056..34.722 rows=100 loops=1)"
" -> Seq Scan on table1 a (cost=0.00..318.00 rows=10 width=4) (actual time=0.033..1.313 rows=10 loops=1)"
" Filter: (global_search = 1)"
" Rows Removed by Filter: 9990"
" -> Index Scan using ix_table2_groupby_orderby on table2 b (cost=0.42..1746.81 rows=10 width=121) (actual time=0.159..3.337 rows=10 loops=10)"
" Index Cond: ((groupby < 10) AND (orderby = a.id))"
"Planning time: 0.296 ms"
"Execution time: 34.775 ms"
而我的问题是:为什么他不访问查询计划中的table2
?他只使用 ix_table2_groupby_orderby
,但它只包含 groupby
、orderby
和可能的 id
列。他如何得到Table2
的剩余列,为什么它不在查询计划中?
** 编辑 **
我已经按照@laurenzalbe 的建议尝试解释(详细)。这是结果
"Nested Loop (cost=0.42..17787.05 rows=100 width=121) (actual time=0.070..35.678 rows=100 loops=1)"
" Output: b.id, b.groupby, b.orderby, b.local_search, b.global_search, b.padding"
" -> Seq Scan on public.table1 a (cost=0.00..318.00 rows=10 width=4) (actual time=0.031..1.642 rows=10 loops=1)"
" Output: a.id, a.groupby, a.orderby, a.local_search, a.global_search, a.padding"
" Filter: (a.global_search = 1)"
" Rows Removed by Filter: 9990"
" -> Index Scan using ix_table2_groupby_orderby on public.table2 b (cost=0.42..1746.81 rows=10 width=121) (actual time=0.159..3.398 rows=10 loops=10)"
" Output: b.id, b.groupby, b.orderby, b.local_search, b.global_search, b.padding"
" Index Cond: ((b.groupby < 10) AND (b.orderby = a.id))"
"Planning time: 16.201 ms"
"Execution time: 35.754 ms"
其实我不是很明白为什么table2的heap没有访问权限,但是我接受了这个答案。
PostgreSQL 中的索引扫描不仅访问索引,还访问 table。这在执行计划中没有明确显示,并且是查明某行是否对事务可见所必需的。
尝试 EXPLAIN (VERBOSE)
查看返回的列。
All indexes in PostgreSQL are secondary indexes, meaning that each index is stored separately from the table's main data area (which is called the table's heap in PostgreSQL terminology). This means that in an ordinary index scan, each row retrieval requires fetching data from both the index and the heap.
我有两个相同的表,一个有 1k 行,另一个有 1M 行。我使用以下脚本来填充它们。
CREATE TABLE Table1 (
id int NOT NULL primary key,
groupby int NOT NULL,
orderby int NOT NULL,
local_search int NOT NULL,
global_search int NOT NULL,
padding varchar(100) NOT NULL
);
CREATE TABLE Table2 (
id int NOT NULL primary key,
groupby int NOT NULL,
orderby int NOT NULL,
local_search int NOT NULL,
global_search int NOT NULL,
padding varchar(100) NOT NULL
);
INSERT
INTO Table1
WITH t1 AS
(
SELECT id
FROM generate_series(1, 10000) id
), t2 AS
(
SELECT id,
id % 100 groupby
FROM t1
), t3 AS
(
SELECT b.id, b.groupby, row_number() over (partition by groupby order by id) orderby
FROM t2 b
)
SELECT id,
groupby,
orderby,
orderby % 50 local_search,
id % 1000 global_search,
RPAD('Value ' || id || ' ' , 100, '*') as padding
FROM t3;
INSERT
INTO Table2
WITH t1 AS
(
SELECT id
FROM generate_series(1, 1000000) id
), t2 AS
(
SELECT id,
id % 100 groupby
FROM t1
), t3 AS
(
SELECT b.id, b.groupby, row_number() over (partition by groupby order by id) orderby
FROM t2 b
)
SELECT id,
groupby,
orderby,
orderby % 50 local_search,
id % 1000 global_search,
RPAD('Value ' || id || ' ' , 100, '*') as padding
FROM t3;
我还在表 2 上创建了二级索引
CREATE INDEX ix_Table2_groupby_orderby ON Table2 (groupby, orderby);
现在,我有以下查询
select b.id, b.groupby, b.orderby, b.local_search, b.global_search, b.padding
from Table2 b
join Table1 a on b.orderby = a.id
where a.global_search = 1 and b.groupby < 10;
使用 explain(analyze)
导致以下查询计划"Nested Loop (cost=0.42..17787.05 rows=100 width=121) (actual time=0.056..34.722 rows=100 loops=1)"
" -> Seq Scan on table1 a (cost=0.00..318.00 rows=10 width=4) (actual time=0.033..1.313 rows=10 loops=1)"
" Filter: (global_search = 1)"
" Rows Removed by Filter: 9990"
" -> Index Scan using ix_table2_groupby_orderby on table2 b (cost=0.42..1746.81 rows=10 width=121) (actual time=0.159..3.337 rows=10 loops=10)"
" Index Cond: ((groupby < 10) AND (orderby = a.id))"
"Planning time: 0.296 ms"
"Execution time: 34.775 ms"
而我的问题是:为什么他不访问查询计划中的table2
?他只使用 ix_table2_groupby_orderby
,但它只包含 groupby
、orderby
和可能的 id
列。他如何得到Table2
的剩余列,为什么它不在查询计划中?
** 编辑 **
我已经按照@laurenzalbe 的建议尝试解释(详细)。这是结果
"Nested Loop (cost=0.42..17787.05 rows=100 width=121) (actual time=0.070..35.678 rows=100 loops=1)"
" Output: b.id, b.groupby, b.orderby, b.local_search, b.global_search, b.padding"
" -> Seq Scan on public.table1 a (cost=0.00..318.00 rows=10 width=4) (actual time=0.031..1.642 rows=10 loops=1)"
" Output: a.id, a.groupby, a.orderby, a.local_search, a.global_search, a.padding"
" Filter: (a.global_search = 1)"
" Rows Removed by Filter: 9990"
" -> Index Scan using ix_table2_groupby_orderby on public.table2 b (cost=0.42..1746.81 rows=10 width=121) (actual time=0.159..3.398 rows=10 loops=10)"
" Output: b.id, b.groupby, b.orderby, b.local_search, b.global_search, b.padding"
" Index Cond: ((b.groupby < 10) AND (b.orderby = a.id))"
"Planning time: 16.201 ms"
"Execution time: 35.754 ms"
其实我不是很明白为什么table2的heap没有访问权限,但是我接受了这个答案。
PostgreSQL 中的索引扫描不仅访问索引,还访问 table。这在执行计划中没有明确显示,并且是查明某行是否对事务可见所必需的。
尝试 EXPLAIN (VERBOSE)
查看返回的列。
All indexes in PostgreSQL are secondary indexes, meaning that each index is stored separately from the table's main data area (which is called the table's heap in PostgreSQL terminology). This means that in an ordinary index scan, each row retrieval requires fetching data from both the index and the heap.