Using 'on conflict' with a unique constraint on a table partitioned by date
Given the following table:
CREATE TABLE event_partitioned (
customer_id varchar(50) NOT NULL,
user_id varchar(50) NOT NULL,
event_id varchar(50) NOT NULL,
comment varchar(50) NOT NULL,
event_timestamp timestamp with time zone DEFAULT NOW()
)
PARTITION BY RANGE (event_timestamp);
partitioned by calendar week (one example):
CREATE TABLE event_partitioned_2020_51 PARTITION OF event_partitioned
FOR VALUES FROM ('2020-12-14') TO ('2020-12-21'); -- upper bound is exclusive, so this covers Monday through Sunday
and a unique constraint (event_timestamp is required because it is the partition key):
ALTER TABLE event_partitioned
ADD UNIQUE (customer_id, user_id, event_id, event_timestamp);
If customer_id, user_id, event_id already exist I would like to update, otherwise insert:
INSERT INTO event_partitioned (customer_id, user_id, event_id)
VALUES ('9', '99', '999')
ON CONFLICT (customer_id, user_id, event_id, event_timestamp) DO UPDATE
SET comment = 'I got updated';
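But I cannot add a unique constraint on customer_id, user_id, event_id alone; it has to include event_timestamp as well.
So this will insert duplicates of customer_id, user_id, event_id. The same happens even when now() is added as a fourth value, unless now() exactly matches what is already in event_timestamp.
Is there a way to make ON CONFLICT less 'granular', so that it updates when now() falls within the partition's week rather than only at exactly '2020-12-14 09:13:04.543256', for example?
Basically, I am trying to avoid duplicates of customer_id, user_id, event_id at least within one week, while still benefiting from partitioning by week (so that data retrieval can be narrowed to a date range instead of scanning the entire partitioned table).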
I don't think you can do this with on conflict on a partitioned table. However, you can express the logic with a CTE:
with
data as ( -- data
select '9' as customer_id, '99' as user_id, '999' as event_id
),
ins as ( -- insert if not exists
insert into event_partitioned (customer_id, user_id, event_id)
select * from data d
where not exists (
select 1
from event_partitioned ep
where
ep.customer_id = d.customer_id
and ep.user_id = d.user_id
and ep.event_id = d.event_id
)
returning *
)
update event_partitioned ep -- update if insert did not happen
set comment = 'I got updated'
from data d
where
ep.customer_id = d.customer_id
and ep.user_id = d.user_id
and ep.event_id = d.event_id
and not exists (select 1 from ins)
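The query above checks for the key across all partitions. Since the question only asks for uniqueness within a week, here is a minimal sketch of the same CTE pattern with the existence check and the update limited to the current calendar week. It assumes PostgreSQL 11 or later, that weeks start on Monday (which is what date_trunc('week', ...) returns), and that the partition bounds were created in the same time zone as the session. The value 'inserted' is only a placeholder, needed because comment is declared NOT NULL without a default. The range predicate on event_timestamp may also let the server prune the scan to a single partition.

```sql
-- Sketch only: upsert scoped to the current calendar week.
with
    data as (  -- the key to upsert
        select '9' as customer_id, '99' as user_id, '999' as event_id
    ),
    ins as (  -- insert only if the key is not yet present in this week
        insert into event_partitioned (customer_id, user_id, event_id, comment)
        select d.customer_id, d.user_id, d.event_id,
               'inserted'  -- placeholder value; comment is NOT NULL
        from data d
        where not exists (
            select 1
            from event_partitioned ep
            where ep.customer_id = d.customer_id
              and ep.user_id = d.user_id
              and ep.event_id = d.event_id
              and ep.event_timestamp >= date_trunc('week', now())
              and ep.event_timestamp <  date_trunc('week', now()) + interval '1 week'
        )
        returning 1
    )
update event_partitioned ep  -- update only if the insert did not happen
set comment = 'I got updated'
from data d
where ep.customer_id = d.customer_id
  and ep.user_id = d.user_id
  and ep.event_id = d.event_id
  and ep.event_timestamp >= date_trunc('week', now())
  and ep.event_timestamp <  date_trunc('week', now()) + interval '1 week'
  and not exists (select 1 from ins);
```

Unlike ON CONFLICT, this pattern is not backed by a unique index, so two sessions running it at the same moment can still both insert the same key.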
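@GMB's answer is great and works well. Since enforcing a unique constraint on a partitioned table (the parent table) that is partitioned by time range is usually not that useful, why not just put a unique constraint/index on the partition itself?
In your case, event_partitioned_2020_51 can have a unique constraint: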
ALTER TABLE event_partitioned_2020_51
ADD UNIQUE (customer_id, user_id, event_id, event_timestamp);
and subsequent queries can simply use
INSERT INTO event_partitioned_2020_51 ... ON CONFLICT (customer_id, user_id, event_id, event_timestamp) ...
as long as that is the intended partition, which is usually the case.
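As a possible variant of this idea (not what the answer above literally states): a leaf partition is an ordinary table, so a unique constraint declared on it does not have to contain the parent's partition key. Leaving event_timestamp out would give one row per key per week, which seems to be what the question is after. The sketch below assumes that goal; the constraint name, the 'inserted' comment value and the explicit timestamp are hypothetical, chosen only so the statements can run against the example partition.

```sql
-- Sketch only: per-partition uniqueness without the partition key.
ALTER TABLE event_partitioned_2020_51
    ADD CONSTRAINT event_partitioned_2020_51_key_uq  -- hypothetical name
    UNIQUE (customer_id, user_id, event_id);

-- Insert directly into the partition; the timestamp must fall inside
-- the partition's bounds, otherwise the insert is rejected.
INSERT INTO event_partitioned_2020_51
    (customer_id, user_id, event_id, comment, event_timestamp)
VALUES
    ('9', '99', '999', 'inserted', '2020-12-15 09:13:04+00')
ON CONFLICT (customer_id, user_id, event_id) DO UPDATE
    SET comment = 'I got updated';
```

Running the INSERT twice should leave a single row for the key, with comment set to 'I got updated'. The constraint has to be added to every weekly partition separately, and the application has to choose the correct partition name for each insert.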