Partitioning on a UUID in Postgres 12 or 13
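Question
I've been asked to replicate a large amount of data to Postgres in a new table. The data contains lists of assembly components, simplified in the table definition below: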
CREATE TABLE IF NOT EXISTS assembly_item (
    id uuid NOT NULL DEFAULT NULL,
    assembly_id uuid NOT NULL DEFAULT NULL,
    done_dts timestamp NOT NULL DEFAULT 'epoch',
    CONSTRAINT assembly_item_pk
        PRIMARY KEY (id)
);
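The original has a few dozen attributes, and there are hundreds of millions of rows now. These records are spread across multiple installations and are not stored in Postgres locally. Inserts on this table add up quickly, and my guess is that it will grow to 1B rows within a year. Data is rarely updated and never deleted. (It might happen in time, but not often.) The same id will never be repeated with a different assembly_id value, so it is safe to be unique on id at the partition level. The goal here is to offload this data onto Postgres and keep only recent data in the cache on local servers.
This looks like a natural fit for partitioning, and I'm looking for some guidance on a sensible strategy. As you can see from the simplified structure, we have a unique row id, a parent assembly_id, and a timestamp. I've looked at the existing queries in the original database, and the primary search field is assembly_id, the parent record identifier. The cardinality between assembly and assembly_item is roughly 1:200.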
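For partitioning to help the most, it seems the data should be split on a value that lets the query planner prune partitions intelligently. I've thought of a few ideas, but I don't have 200M rows to test against yet. In the meantime, what I'm considering is:
- Partition by month, using RANGE or LIST on the YYYY-MM of done_dts. Rewrite all queries to include a date range.
- Partition by HASH on the first two characters of assembly_id::text, giving 256 partitions of comparable size. I think this lets us search on assembly_id and prune many partitions that have no matches, but it looks weird when I set it up.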
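I appreciate that this is a somewhat speculative question. I'm hoping for some pointers that might make my first attempts more successful. Once I have a bit of a data set, I can experiment more directly.
I've included the experimental setup code, listing only sample partitions for brevity.
Sample setup with LIST partitioning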
------------------------------------
-- Define table partitioned by list
------------------------------------
-- Could alternatively use RANGE here to partition by month.
BEGIN;
-- Drop parent table, if it exists.
-- This destroys ALL partitions automatically, even without a CASCADE clause.
DROP TABLE IF EXISTS assembly_item_list CASCADE;
CREATE TABLE IF NOT EXISTS assembly_item_list (
    id uuid NOT NULL DEFAULT NULL,
    assembly_id uuid NOT NULL DEFAULT NULL,
    assembly_done_dts timestamp NOT NULL DEFAULT 'epoch', -- Copied in from assembly.done_dts when rows are pushed to Postgres.
    year_and_month citext NOT NULL DEFAULT NULL, -- YYYY-MM from assembly_done_dts, calculated in insert function. Can't use a generated column as a partition key.
    -- Reminder: id values come from the various source tables in IB. The upsert writes over matches ON CONFLICT with this ID.
    -- Note: You *must* include the partition key in the primary key. It's a rule.
    CONSTRAINT assembly_item_list_pk
        PRIMARY KEY (year_and_month, id)
) PARTITION BY LIST (year_and_month);
-- Previous year partitions built here...
-- Build out 2021 completely.
CREATE TABLE assembly_item_list_2021_01 PARTITION OF assembly_item_list FOR VALUES IN ('2021-01');
CREATE TABLE assembly_item_list_2021_02 PARTITION OF assembly_item_list FOR VALUES IN ('2021-02');
-- etc.
-- In case I screw up at the end of the year....
CREATE TABLE assembly_item_list_default PARTITION OF assembly_item_list DEFAULT;
COMMIT;
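For comparison, the RANGE alternative mentioned in the comment at the top of this setup might look roughly like the following. This is only a sketch: the table name assembly_item_range and its partition names are hypothetical, and it assumes partitioning directly on assembly_done_dts. That avoids the extra year_and_month column, but the timestamp then has to be part of the primary key.

```sql
-- Hypothetical RANGE variant; all names are illustrative, not from the original setup.
BEGIN;

CREATE TABLE IF NOT EXISTS assembly_item_range (
    id                uuid      NOT NULL,
    assembly_id       uuid      NOT NULL,
    assembly_done_dts timestamp NOT NULL DEFAULT 'epoch',
    -- The partition key column has to be part of the primary key.
    CONSTRAINT assembly_item_range_pk
        PRIMARY KEY (assembly_done_dts, id)
) PARTITION BY RANGE (assembly_done_dts);

-- The lower bound is inclusive, the upper bound is exclusive.
CREATE TABLE assembly_item_range_2021_01 PARTITION OF assembly_item_range
    FOR VALUES FROM ('2021-01-01') TO ('2021-02-01');
CREATE TABLE assembly_item_range_2021_02 PARTITION OF assembly_item_range
    FOR VALUES FROM ('2021-02-01') TO ('2021-03-01');

-- Catch-all for rows outside the ranges defined so far.
CREATE TABLE assembly_item_range_default PARTITION OF assembly_item_range DEFAULT;

COMMIT;
```

One consequence worth noting: uniqueness is then enforced on the pair (assembly_done_dts, id) rather than on id alone, so an upsert keyed only on id would need rethinking.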
Sample setup with HASH partitioning
------------------------------------
-- Define table partitioned by hash
------------------------------------
BEGIN;
-- Drop parent table, if it exists.
-- This destroys ALL partitions automatically, even without a CASCADE clause.
DROP TABLE IF EXISTS assembly_item_hash CASCADE;
CREATE TABLE IF NOT EXISTS assembly_item_hash (
    id uuid NOT NULL DEFAULT NULL,
    assembly_id uuid NOT NULL DEFAULT NULL,
    assembly_done_dts timestamp NOT NULL DEFAULT 'epoch', -- Copied in from assembly.done_dts when rows are pushed to Postgres.
    partition_key text NOT NULL DEFAULT NULL, -- '00', '0A', etc. Populated in a BEFORE INSERT trigger on the partition. Can't use a generated column as a partition key, can't use a column reference in DEFAULT.
    -- Reminder: id values come from the various source tables in IB. The upsert writes over matches ON CONFLICT with this ID.
    -- Note: You *must* include the partition key in the primary key. It's a rule.
    CONSTRAINT assembly_item_hash_pk
        PRIMARY KEY (partition_key, id)
) PARTITION BY HASH (partition_key);
-----------------------------------------------------
-- Create trigger function to populate partition_key
-----------------------------------------------------
-- The partition key is a two-character hex string, like '00', '3E', and so on.
CREATE OR REPLACE FUNCTION set_partition_key()
RETURNS TRIGGER AS $$
BEGIN
    NEW.partition_key = UPPER(LEFT(NEW.assembly_id, 2));
    RETURN NEW;
END;
$$ language plpgsql IMMUTABLE; -- I don't think that I need to worry about IMMUTABLE here. 01234567890ABCDEF shouldn't break.
-----------------------------------------------------
-- Build partitions
-----------------------------------------------------
-- Note: Have to assign triggers to partitions individually.
-- Seems that it would be easier to add the logic to my central insert function.
CREATE TABLE assembly_item_hash_00 PARTITION OF assembly_item_hash FOR VALUES WITH (modulus 256, remainder 0);
CREATE TRIGGER set_partition_key_trigger_00
    BEFORE INSERT OR UPDATE ON assembly_item_hash_00
    FOR EACH ROW
    EXECUTE PROCEDURE set_partition_key();
CREATE TABLE assembly_item_hash_01 PARTITION OF assembly_item_hash FOR VALUES WITH (modulus 256, remainder 1);
CREATE TRIGGER set_partition_key_trigger_01
    BEFORE INSERT OR UPDATE ON assembly_item_hash_01
    FOR EACH ROW
    EXECUTE PROCEDURE set_partition_key();
-- And so on for all 256 partitions.
COMMIT;
Any advice? Really, anything that comes to mind?
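Answer
I can't say whether the date or the UUID hash is the better partition key. But I can say this: either of your solutions can be made more efficient.
Hash partitioning based on uuid
Your plan to add a partition key column and populate it with a trigger function is inefficient. And unnecessary. (Problems with the trigger function itself aside.)
There seems to be a misunderstanding. You have the comment: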
-- Note: You must include the partition key in the primary key. It's a rule.
Not exactly. The manual:
Unique constraints (and hence primary keys) on partitioned tables must
include all the partition key columns. This limitation exists because
the individual indexes making up the constraint can only directly
enforce uniqueness within their own partitions; therefore, the
partition structure itself must guarantee that there are not
duplicates in different partitions.
Partition key columns. Not the partition key itself.
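A minimal sketch of what that rule means in practice, using a hypothetical throwaway table pk_demo (not part of the setup discussed here): the first statement should be rejected because the primary key omits the partition key column, while the second should be accepted because the key merely has to contain that column.

```sql
-- Rejected: PRIMARY KEY (id) does not include the partition key column assembly_id.
CREATE TABLE pk_demo (
    id          uuid NOT NULL,
    assembly_id uuid NOT NULL,
    PRIMARY KEY (id)
) PARTITION BY HASH (assembly_id);

-- Accepted: the primary key contains assembly_id, alongside another column.
CREATE TABLE pk_demo (
    id          uuid NOT NULL,
    assembly_id uuid NOT NULL,
    PRIMARY KEY (assembly_id, id)
) PARTITION BY HASH (assembly_id);
```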
A setup with hash partitioning on (assembly_id) works with a PK that includes the same column. Like this:
CREATE TABLE IF NOT EXISTS assembly_item_hash (
assembly_id uuid NOT NULL
, id uuid NOT NULL
, assembly_done_dts timestamp NOT NULL DEFAULT 'epoch'
, PRIMARY KEY (assembly_id, id)
) PARTITION BY HASH (assembly_id);
CREATE TABLE assembly_item_hash_000 PARTITION OF assembly_item_hash FOR VALUES WITH (MODULUS 256, REMAINDER 0);
CREATE TABLE assembly_item_hash_001 PARTITION OF assembly_item_hash FOR VALUES WITH (MODULUS 256, REMAINDER 1);
-- etc.
Much simpler.
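Writing out 256 CREATE TABLE statements by hand is tedious. One way to generate them is a DO block with dynamic SQL, sketched here under the assumption that the parent table assembly_item_hash from above already exists and that the partitions follow the same three-digit naming scheme. The UUID in the EXPLAIN is an arbitrary placeholder.

```sql
-- Generate all 256 hash partitions: assembly_item_hash_000 .. assembly_item_hash_255.
DO
$$
BEGIN
   FOR i IN 0..255 LOOP
      EXECUTE format(
         'CREATE TABLE IF NOT EXISTS %I PARTITION OF assembly_item_hash
          FOR VALUES WITH (MODULUS 256, REMAINDER %s)',
         'assembly_item_hash_' || to_char(i, 'FM000'),  -- zero-padded suffix
         i);
   END LOOP;
END
$$;

-- Check whether the planner prunes partitions for an equality filter on the partition key.
EXPLAIN (COSTS OFF)
SELECT *
FROM   assembly_item_hash
WHERE  assembly_id = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11';
```

If pruning works as hoped, the plan should mention a single partition rather than all 256.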
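The only drawback: the PK index is bigger, since uuid occupies 16 bytes.
If that's an issue, you might fall back to the generated partition_key you had in mind, with a trigger per partition. (Ugh, lots of overhead!) But make the column integer instead of text, and use the much more efficient built-in hash function uuid_hash(). It comes from the same hash operator class that hash partitioning uses internally (partitioning itself calls the seeded variant, uuid_hash_extended()). But now we use it explicitly and do LIST partitioning: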
CREATE TABLE IF NOT EXISTS assembly_item_hash (
id uuid NOT NULL
, assembly_id uuid NOT NULL
, partition_key int4 NOT NULL
, assembly_done_dts timestamp NOT NULL DEFAULT 'epoch'
, PRIMARY KEY (partition_key, id)
) PARTITION BY LIST (partition_key);
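This adds 4 bytes to each table row and saves 12 bytes from each index entry - in theory. Due to alignment padding, you lose another 4 bytes in both table and index, ending up with the same total space on disk as before (roughly - table and index bloat may differ).
Unless "column tetris" allows you to fit the column in more efficiently, to win a total of 8 bytes per row... See: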
- Calculating and saving space in PostgreSQL
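How the integer partition_key could be derived and used is sketched below. The bucket count of 256, the bit mask, the partition names, and the sample UUIDs are all assumptions for illustration; the parent table is the LIST-partitioned assembly_item_hash shown above.

```sql
-- uuid_hash() returns a signed int4. Masking with 255 keeps the low 8 bits,
-- which yields a bucket number between 0 and 255 even for negative hash values.
SELECT uuid_hash('a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11'::uuid) & 255 AS partition_key;

-- One LIST partition per bucket (two shown).
CREATE TABLE assembly_item_hash_000 PARTITION OF assembly_item_hash FOR VALUES IN (0);
CREATE TABLE assembly_item_hash_001 PARTITION OF assembly_item_hash FOR VALUES IN (1);
-- etc.

-- Sketch only: a DEFAULT partition so the sample INSERT below cannot fail
-- while most of the bucket partitions are still missing.
CREATE TABLE assembly_item_hash_default PARTITION OF assembly_item_hash DEFAULT;

-- Computing the key in the INSERT itself avoids a trigger per partition.
INSERT INTO assembly_item_hash (id, assembly_id, partition_key)
SELECT v.id, v.assembly_id, uuid_hash(v.assembly_id) & 255
FROM  (VALUES ('6ecd8c99-4036-403d-bf84-cf8400f67836'::uuid
             , 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11'::uuid)) AS v(id, assembly_id);

-- Queries have to filter on partition_key as well to give the planner a chance to prune.
SELECT *
FROM   assembly_item_hash
WHERE  partition_key = uuid_hash('a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11'::uuid) & 255
AND    assembly_id   = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11';
```

The trade-off against plain hash partitioning is that every query has to repeat the key expression, whereas with PARTITION BY HASH (assembly_id) a filter on assembly_id alone is enough.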
List partitioning based on timestamp
Don't use citext. It's a needless complication.
Use an integer instead of YYYY-MM. Smaller, faster. I suggest this basic function:
CREATE FUNCTION f_yyyymm(timestamp)
  RETURNS int
  LANGUAGE sql PARALLEL SAFE IMMUTABLE AS
'SELECT (EXTRACT(year FROM $1) * 100 + EXTRACT(month FROM $1))::int';
See:
- How do you do date math that ignores the year?
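Put together, a LIST setup with an integer key might look roughly like this. It is a sketch under a few assumptions: the table reuses the name assembly_item_list from the question with year_and_month changed to int, a function f_yyyymm(timestamp) returning year * 100 + month has been created, and the UUIDs and timestamp are placeholders.

```sql
CREATE TABLE IF NOT EXISTS assembly_item_list (
    id                uuid      NOT NULL,
    assembly_id       uuid      NOT NULL,
    assembly_done_dts timestamp NOT NULL DEFAULT 'epoch',
    year_and_month    int       NOT NULL,  -- e.g. 202101, computed with f_yyyymm()
    CONSTRAINT assembly_item_list_pk
        PRIMARY KEY (year_and_month, id)
) PARTITION BY LIST (year_and_month);

CREATE TABLE assembly_item_list_2021_01 PARTITION OF assembly_item_list FOR VALUES IN (202101);
CREATE TABLE assembly_item_list_2021_02 PARTITION OF assembly_item_list FOR VALUES IN (202102);
-- etc.

-- The insert computes the key from the timestamp, so no trigger is needed.
INSERT INTO assembly_item_list (id, assembly_id, assembly_done_dts, year_and_month)
VALUES ('6ecd8c99-4036-403d-bf84-cf8400f67836'
      , 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11'
      , '2021-01-15 10:30'
      , f_yyyymm('2021-01-15 10:30'));

-- A filter on the integer key lets the planner restrict the scan to one month.
SELECT *
FROM   assembly_item_list
WHERE  year_and_month = 202101
AND    assembly_id = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11';
```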