Create custom hash operator for Postgres partitioning
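I want to create a custom hash function that Postgres (version 13.2) will use to distribute rows across partitions. The problem is that with my current solution Postgres does not use partition pruning.
Here is my code: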
-- dummy hash function
CREATE OR REPLACE FUNCTION partition_custom_bigint_hash(value BIGINT, seed BIGINT)
RETURNS BIGINT AS $$
SELECT value;
$$ LANGUAGE SQL IMMUTABLE PARALLEL SAFE;
-- operator
CREATE OPERATOR CLASS partition_custom_bigint_hash_op
FOR TYPE int8
USING hash AS
OPERATOR 1 =,
FUNCTION 2 partition_custom_bigint_hash(BIGINT, BIGINT);
-- table partitioned by hash with custom operator
CREATE TABLE sample(part_id BIGINT) PARTITION BY hash(part_id partition_custom_bigint_hash_op);
CREATE TABLE sample_part_1 PARTITION OF SAMPLE FOR VALUES WITH (modulus 3, remainder 0);
CREATE TABLE sample_part_2 PARTITION OF SAMPLE FOR VALUES WITH (modulus 3, remainder 1);
CREATE TABLE sample_part_3 PARTITION OF SAMPLE FOR VALUES WITH (modulus 3, remainder 2);
Now make sure that partition pruning is enabled and works:
SHOW enable_partition_pruning;
-- enable_partition_pruning
-- --------------------------
-- on
EXPLAIN SELECT * FROM sample WHERE part_id = 1::BIGINT;
-- QUERY PLAN
-- ----------------------------------------------------------------------
-- Seq Scan on sample_part_1 sample (cost=0.00..38.25 rows=11 width=8)
-- Filter: (part_id = '1'::bigint)
-- (2 rows)
So it works fine with the condition part_id = 1::BIGINT, but if I skip the cast to BIGINT I get:
EXPLAIN SELECT * FROM sample WHERE part_id = 1;
-- QUERY PLAN
-- ------------------------------------------------------------------------------
-- Append (cost=0.00..101.36 rows=33 width=8)
-- -> Seq Scan on sample_part_1 sample_1 (cost=0.00..33.73 rows=11 width=8)
-- Filter: (part_id = 1)
-- -> Seq Scan on sample_part_2 sample_2 (cost=0.00..33.73 rows=11 width=8)
-- Filter: (part_id = 1)
-- -> Seq Scan on sample_part_3 sample_3 (cost=0.00..33.73 rows=11 width=8)
-- Filter: (part_id = 1)
Question: What do I need to change so that partition pruning works with both part_id = 1 and part_id = 1::BIGINT?
There are several equality operators with bigint on the left side:
SELECT oid,
oprcode::regproc AS function,
oprright::regtype AS right_side
FROM pg_operator
WHERE oprname = '='
AND oprleft = 'bigint'::regtype;
oid | function | right_side
------+----------+------------
410 | int8eq | bigint
416 | int84eq | integer
1868 | int82eq | smallint
(3 rows)
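Your second query uses the second of these operators (int84eq, because the literal 1 is of type integer), but that operator is not part of your custom operator family, so no partition pruning takes place.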
See this comment in match_clause_to_partition_key in src/backend/partitioning/partprune.c:
/*
* See if the operator is relevant to the partitioning opfamily.
*
* Normally we only care about operators that are listed as being part
* of the partitioning operator family. But there is one exception:
* the not-equals operators are not listed in any operator family
* whatsoever, but their negators (equality) are. We can use one of
* those if we find it, but only for list partitioning.
*
* Note: we report NOMATCH on failure, in case a later partkey has the
* same expression but different opfamily. That's unlikely, but not
* much more so than duplicate expressions with different collations.
*/
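To see this on your own system, the catalogs can be queried directly. The following is a minimal sketch, assuming the operator class from the question (partition_custom_bigint_hash_op) exists and that the literal in the query is a plain integer constant; it shows the type the parser assigns to the literal and lists the operators that belong to the operator family behind that class:

```sql
-- An undecorated integer literal that fits in 32 bits is typed as integer,
-- so "part_id = 1" resolves to the (bigint, integer) equality operator.
SELECT pg_typeof(1)         AS literal_type,
       pg_typeof(1::bigint) AS cast_type;

-- List the operators that are members of the operator family used by the
-- operator class from the question.
SELECT amopopr::regoperator   AS operator,
       amoplefttype::regtype  AS left_type,
       amoprighttype::regtype AS right_type
FROM pg_amop
WHERE amopfamily IN (SELECT opcfamily
                     FROM pg_opclass
                     WHERE opcname = 'partition_custom_bigint_hash_op');
```

If the second query lists only the (bigint, bigint) equality operator, a clause that uses the (bigint, integer) operator cannot be matched to the partition key.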
Create an operator family that contains the required operators:
-- Hash support function 2 takes the value and an int8 seed and returns int8.
CREATE FUNCTION partition_custom_hash(value int8, seed int8) RETURNS int8
AS 'SELECT value' LANGUAGE SQL IMMUTABLE PARALLEL SAFE;
CREATE FUNCTION partition_custom_hash(value int4, seed int8) RETURNS int8
AS 'SELECT value::int8' LANGUAGE SQL IMMUTABLE PARALLEL SAFE;
CREATE FUNCTION partition_custom_hash(value int2, seed int8) RETURNS int8
AS 'SELECT value::int8' LANGUAGE SQL IMMUTABLE PARALLEL SAFE;
CREATE OPERATOR FAMILY partition_custom_integer_hash_ops USING hash;
CREATE OPERATOR CLASS partition_custom_int8_hash_ops FOR TYPE int8 USING hash
FAMILY partition_custom_integer_hash_ops AS
OPERATOR 1 = (int8, int8),
FUNCTION 2 partition_custom_hash(int8, int8);
CREATE OPERATOR CLASS partition_custom_int4_hash_ops FOR TYPE int4 USING hash
FAMILY partition_custom_integer_hash_ops AS
OPERATOR 1 = (int8, int4),
FUNCTION 2 partition_custom_hash(int4, int8);
CREATE OPERATOR CLASS partition_custom_int2_hash_ops FOR TYPE int2 USING hash
FAMILY partition_custom_integer_hash_ops AS
OPERATOR 1 = (int8, int2),
FUNCTION 2 partition_custom_hash(int2, int8);
If you then use partition_custom_int8_hash_ops as the operator class for the partition key, it should work as expected.
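As a quick check, the table from the question could be recreated with the new operator class and both queries explained again. This is only a sketch: it assumes the operator family and classes above were created successfully and that dropping the existing sample table is acceptable.

```sql
-- Recreate the partitioned table, this time with the int8 operator class
-- that belongs to the cross-type operator family.
DROP TABLE IF EXISTS sample;
CREATE TABLE sample(part_id bigint)
    PARTITION BY hash (part_id partition_custom_int8_hash_ops);
CREATE TABLE sample_part_1 PARTITION OF sample FOR VALUES WITH (modulus 3, remainder 0);
CREATE TABLE sample_part_2 PARTITION OF sample FOR VALUES WITH (modulus 3, remainder 1);
CREATE TABLE sample_part_3 PARTITION OF sample FOR VALUES WITH (modulus 3, remainder 2);

-- Compare the two plans: with pruning, each should scan a single
-- partition rather than an Append over all three.
EXPLAIN SELECT * FROM sample WHERE part_id = 1;
EXPLAIN SELECT * FROM sample WHERE part_id = 1::bigint;
```

If the first plan still shows an Append node over all partitions, checking pg_amop for the (bigint, integer) equality operator in partition_custom_integer_hash_ops is a reasonable next step.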