Index size after autovacuum
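Good day. I am reading the official Postgres documentation on the vacuum process and the REINDEX routine (the docs for version 12). A few sentences are unclear to me, so I would like to clarify them.
First of all: as I understand it, autovacuum scans a table for dead tuples and stores their locations in memory limited by maintenance_work_mem. When that memory is full, vacuum removes the entries that reference those locations from all indexes. The documentation about reindexing says: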
B-tree index pages that have become completely empty are reclaimed for
re-use. However, there is still a possibility of inefficient use of
space: if all but a few index keys on a page have been deleted, the
page remains allocated
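Here is the question. If the "page remains allocated", does that mean autovacuum does not return the physical space of deleted pages inside the index to the OS? For example, an index takes up 1 GB of disk space. I delete all rows but one from the table and run vacuum. In that case the index will still take up 1 GB. Am I right?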
That is true for plain VACUUM (but not for VACUUM FULL):
select version();
version
---------------------------------------------------------------------------------------------------------
PostgreSQL 12.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit
(1 row)
create table t(s text);
CREATE TABLE
insert into t select generate_series(1,300000)::text;
INSERT 0 300000
select pg_size_pretty(pg_table_size('t'));
pg_size_pretty
----------------
10 MB
(1 row)
create index on t(s);
CREATE INDEX
select pg_size_pretty(pg_indexes_size('t'));
pg_size_pretty
----------------
6600 kB
(1 row)
delete from t where s <> '1';
DELETE 299999
select count(*) from t;
count
-------
1
(1 row)
select pg_size_pretty(pg_table_size('t'));
pg_size_pretty
----------------
10 MB
(1 row)
select pg_size_pretty(pg_indexes_size('t'));
pg_size_pretty
----------------
6600 kB
(1 row)
vacuum t;
VACUUM
select pg_size_pretty(pg_table_size('t'));
pg_size_pretty
----------------
48 kB
(1 row)
select pg_size_pretty(pg_indexes_size('t'));
pg_size_pretty
----------------
6600 kB
(1 row)
vacuum full t;
VACUUM
select pg_size_pretty(pg_table_size('t'));
pg_size_pretty
----------------
16 kB
(1 row)
select pg_size_pretty(pg_indexes_size('t'));
pg_size_pretty
----------------
16 kB
(1 row)
It is not true for REINDEX either, which rebuilds the index from scratch:
select version();
version
---------------------------------------------------------------------------------------------------------
PostgreSQL 12.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit
(1 row)
create table t(s text);
CREATE TABLE
insert into t select generate_series(1,300000)::text;
INSERT 0 300000
select pg_size_pretty(pg_table_size('t'));
pg_size_pretty
----------------
10 MB
(1 row)
create index on t(s);
CREATE INDEX
select pg_size_pretty(pg_indexes_size('t'));
pg_size_pretty
----------------
6600 kB
(1 row)
delete from t where s <> '1';
DELETE 299999
select count(*) from t;
count
-------
1
(1 row)
select pg_size_pretty(pg_table_size('t'));
pg_size_pretty
----------------
10 MB
(1 row)
select pg_size_pretty(pg_indexes_size('t'));
pg_size_pretty
----------------
6600 kB
(1 row)
reindex table t;
REINDEX
select pg_size_pretty(pg_table_size('t'));
pg_size_pretty
----------------
10 MB
(1 row)
select pg_size_pretty(pg_indexes_size('t'));
pg_size_pretty
----------------
16 kB
(1 row)
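The README in src/backend/access/nbtree has a lot of in-depth information about this. The quotations in this answer are taken from there.
If you really delete all but one row of the table, almost all pages in the index will be deleted.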
We consider deleting an entire page from the btree only when it's become
completely empty of items. (Merging partly-full pages would allow better
space reuse, but it seems impractical to move existing data items left or
right to make this happen --- a scan moving in the opposite direction
might miss the items if so.) Also, we never delete the rightmost page
on a tree level (this restriction simplifies the traversal algorithms, as
explained below). Page deletion always begins from an empty leaf page. An
internal page can only be deleted as part of deleting an entire subtree.
This is always a "skinny" subtree consisting of a "chain" of internal pages
plus a single leaf page. There is one page on each level of the subtree,
and each level/page covers the same key space.
The space is not released to the operating system, though:
Reclaiming a page doesn't actually change its state on disk --- we simply
record it in the shared-memory free space map, from which it will be
handed out the next time a new page is needed for a page split.
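One way to observe this is sketched below. It assumes that the pgstattuple contrib extension can be installed and that the index from the example session got the default name t_s_idx. pgstatindex reports how many pages of the B-tree are empty or marked as deleted, while pg_relation_size shows the size of the file itself. The exact counts depend on the server version and on how many VACUUM runs have taken place, so no output is shown here.

```sql
-- Assumption: the pgstattuple contrib module is available;
-- creating the extension normally requires superuser rights.
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- Assumption: t_s_idx is the name generated by "create index on t(s)".
SELECT index_size,     -- total size of the index in bytes
       leaf_pages,     -- leaf pages that are still part of the tree
       empty_pages,    -- leaf pages that contain no items
       deleted_pages   -- pages unlinked from the tree, kept in the file for reuse
FROM pgstatindex('t_s_idx');

-- Size of the index file on disk, for comparison before and after VACUUM.
SELECT pg_size_pretty(pg_relation_size('t_s_idx'));
```

Running both queries before and after a plain VACUUM should make the difference between "reclaimed for reuse" and "returned to the operating system" visible.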
Because the depth of the index never shrinks, the tree can become "skinny". PostgreSQL has an optimization for that:
Because we never delete the rightmost page of any level (and in particular
never delete the root), it's impossible for the height of the tree to
decrease. After massive deletions we might have a scenario in which the
tree is "skinny", with several single-page levels below the root.
Operations will still be correct in this case, but we'd waste cycles
descending through the single-page levels. To handle this we use an idea
from Lanin and Shasha: we keep track of the "fast root" level, which is
the lowest single-page level. The meta-data page keeps a pointer to this
level as well as the true root. All ordinary operations initiate their
searches at the fast root not the true root.
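The true root and the fast root can be inspected with the pageinspect contrib extension. This is only a sketch: it assumes pageinspect can be installed (its functions are restricted to superusers by default) and that the index is named t_s_idx as in the example session. Whether the two roots actually differ depends on how deep the index was before the deletions.

```sql
-- Assumption: the pageinspect contrib module is available.
CREATE EXTENSION IF NOT EXISTS pageinspect;

-- bt_metap reads the B-tree metapage.
SELECT root,       -- block number of the true root
       level,      -- level of the true root (0 means the root is a leaf)
       fastroot,   -- block number of the fast root
       fastlevel   -- level of the fast root
FROM bt_metap('t_s_idx');
```

If fastlevel is lower than level, searches start below the true root and skip the single-page levels that the README describes.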
If you run REINDEX INDEX on the index or VACUUM (FULL) on the table, the index is rebuilt and the space is released.
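As a sketch, the commands could look like this for the example table. The index name t_s_idx is an assumption (it is the default name for "create index on t(s)"). Pick one of the three variants; note that VACUUM and REINDEX CONCURRENTLY cannot run inside a transaction block.

```sql
-- Variant 1: rebuild only the index.
-- Writes to the table are blocked while this runs.
REINDEX INDEX t_s_idx;

-- Variant 2 (PostgreSQL 12 and later): rebuild without blocking writes,
-- at the cost of a longer run time and temporary extra disk space.
REINDEX INDEX CONCURRENTLY t_s_idx;

-- Variant 3: rewrite the table and all of its indexes.
-- This takes an ACCESS EXCLUSIVE lock on the table for the duration.
VACUUM (FULL) t;

-- Check the result.
SELECT pg_size_pretty(pg_indexes_size('t'));
```

REINDEX leaves the table file untouched, so it is the lighter choice when only the index is bloated, as the second session above shows.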