Using 'on conflict' with a unique constraint on a table partitioned by date

Given the following table:

CREATE TABLE event_partitioned (
    customer_id varchar(50) NOT NULL,
    user_id varchar(50) NOT NULL,
    event_id varchar(50) NOT NULL,
    comment varchar(50) NOT NULL,
    event_timestamp timestamp with time zone DEFAULT NOW()
)
PARTITION BY RANGE (event_timestamp);

And partitioned by calendar week [one example]:

CREATE TABLE event_partitioned_2020_51 PARTITION OF event_partitioned
FOR VALUES FROM ('2020-12-14') TO ('2020-12-21');  -- upper bound is exclusive

And a unique constraint [event_timestamp is required because it is the partition key]:

ALTER TABLE event_partitioned
    ADD UNIQUE (customer_id, user_id, event_id, event_timestamp);

If customer_id, user_id, event_id already exist, I want to update; otherwise insert:

INSERT INTO event_partitioned (customer_id, user_id, event_id)
VALUES ('9', '99', '999')
ON CONFLICT (customer_id, user_id, event_id, event_timestamp) DO UPDATE
SET comment = 'I got updated';

But I cannot add a unique constraint on customer_id, user_id, event_id alone; it has to include event_timestamp as well.
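
For illustration, this is the kind of statement that gets rejected. It is a minimal sketch that assumes the event_partitioned table defined above already exists:

```sql
-- Attempt a unique constraint that omits the partition key column.
ALTER TABLE event_partitioned
    ADD UNIQUE (customer_id, user_id, event_id);
-- PostgreSQL rejects this: a unique constraint on a partitioned table
-- must include every partition key column, and event_timestamp is missing.
```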

So this will insert duplicates of customer_id, user_id, event_id. That happens even when now() is added as a fourth value, unless now() exactly matches what is already stored in event_timestamp.
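
A small sketch of the effect, assuming the table, the partition and the four-column unique constraint above, and an otherwise empty table. The explicit timestamps and the comment values are illustrative only; comment is supplied because the column is NOT NULL and has no default:

```sql
-- First row for the triple ('9', '99', '999').
INSERT INTO event_partitioned (customer_id, user_id, event_id, comment, event_timestamp)
VALUES ('9', '99', '999', 'first', '2020-12-16 09:13:04.543256+00');

-- Same triple, timestamp differs by less than a second:
-- no conflict is detected, so a second row is inserted.
INSERT INTO event_partitioned (customer_id, user_id, event_id, comment, event_timestamp)
VALUES ('9', '99', '999', 'second', '2020-12-16 09:13:05+00')
ON CONFLICT (customer_id, user_id, event_id, event_timestamp) DO UPDATE
SET comment = 'I got updated';

-- Both rows are present; neither comment was updated.
SELECT comment, event_timestamp
FROM event_partitioned
WHERE customer_id = '9' AND user_id = '99' AND event_id = '999'
ORDER BY event_timestamp;
```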

Is there a way to make ON CONFLICT less 'granular', so that it fires when now() falls within the partition's week rather than exactly on, for example, '2020-12-14 09:13:04.543256'?

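Basically, I am trying to avoid duplicates of customer_id, user_id, event_id at least within one week, while still benefiting from weekly partitioning (so that data retrieval can be narrowed to a date range instead of scanning the whole partitioned table).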
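
The kind of retrieval meant here might look like the following sketch. The date range is an arbitrary example inside week 51, and the plan should be checked with EXPLAIN on the actual data rather than taken for granted:

```sql
-- A range filter on the partition key lets PostgreSQL skip partitions
-- whose bounds cannot contain matching rows.
EXPLAIN
SELECT customer_id, user_id, event_id, comment
FROM event_partitioned
WHERE event_timestamp >= '2020-12-15'
  AND event_timestamp <  '2020-12-18';
```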

I don't think you can do this with ON CONFLICT on a partitioned table. However, you can express the logic with a CTE:

with 
    data as ( -- data
        select '9' as customer_id, '99' as user_id, '999' as event_id
    ),
    ins as (  -- insert if not exists
        insert into event_partitioned (customer_id, user_id, event_id)
        select * from data d
        where not exists (
            select 1 
            from event_partitioned ep
            where 
                ep.customer_id = d.customer_id
                and ep.user_id = d.user_id
                and ep.event_id = d.event_id
        )
        returning *
    )
update event_partitioned ep  -- update if insert did not happen
set comment = 'I got updated'
from data d
where 
    ep.customer_id = d.customer_id
    and ep.user_id = d.user_id
    and ep.event_id = d.event_id
    and not exists (select 1 from ins)
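
The statement above checks every partition for the triple. If duplicates only need to be avoided within the current calendar week, as the question describes, a variation could restrict both the existence check and the update to that week. This is a sketch under assumptions, not part of the original answer: it assumes a partition for the current week exists, that weeks start on Monday (which is what date_trunc('week', ...) yields), and it supplies an illustrative comment value because the column is NOT NULL with no default.

```sql
with
    data as (
        select '9' as customer_id, '99' as user_id, '999' as event_id,
               'I got inserted' as comment  -- illustrative value, comment is NOT NULL
    ),
    ins as (  -- insert only if the triple is absent from the current week
        insert into event_partitioned (customer_id, user_id, event_id, comment)
        select d.customer_id, d.user_id, d.event_id, d.comment
        from data d
        where not exists (
            select 1
            from event_partitioned ep
            where ep.customer_id = d.customer_id
              and ep.user_id = d.user_id
              and ep.event_id = d.event_id
              and ep.event_timestamp >= date_trunc('week', now())
              and ep.event_timestamp <  date_trunc('week', now()) + interval '1 week'
        )
        returning 1
    )
update event_partitioned ep  -- update this week's row if no insert happened
set comment = 'I got updated'
from data d
where ep.customer_id = d.customer_id
  and ep.user_id = d.user_id
  and ep.event_id = d.event_id
  and ep.event_timestamp >= date_trunc('week', now())
  and ep.event_timestamp <  date_trunc('week', now()) + interval '1 week'
  and not exists (select 1 from ins);
```

The filter on event_timestamp gives PostgreSQL a chance to prune the other partitions. Unlike ON CONFLICT, neither version is protected by a constraint, so two sessions running it at the same time can still both insert the same triple.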
    

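@GMB's answer is good and works well. Since enforcing a unique constraint on the partitioned table (the parent table) is usually not that useful when partitioning by time range, why not just place the unique constraint/index on the partition itself?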

In your case, event_partitioned_2020_51 can have a unique constraint:

ALTER TABLE event_partitioned_2020_51
    ADD UNIQUE (customer_id, user_id, event_id, event_timestamp);

And subsequent queries can simply use

INSERT INTO event_partitioned_2020_51 ... ON CONFLICT (customer_id, user_id, event_id, event_timestamp) ...

as long as that is the intended partition, which is usually the case.
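
Going one step further than the constraint shown above: a leaf partition is an ordinary table, so the rule that a unique constraint must include the partition key applies only to the parent. The sketch below relies on that to put a three-column constraint on the partition, which is what gives "no duplicates within one week". The constraint name, the comment value and the explicit timestamp are assumptions for illustration; the timestamp is given explicitly because a row inserted directly into a partition must fall within that partition's bounds.

```sql
-- Unique within this week's partition only; fails if duplicates already exist.
ALTER TABLE event_partitioned_2020_51
    ADD CONSTRAINT event_partitioned_2020_51_triple_key  -- hypothetical name
    UNIQUE (customer_id, user_id, event_id);

-- Upsert directly into the partition, using the three-column conflict target.
INSERT INTO event_partitioned_2020_51
    (customer_id, user_id, event_id, comment, event_timestamp)
VALUES ('9', '99', '999', 'I got inserted', '2020-12-16 10:00:00+00')
ON CONFLICT (customer_id, user_id, event_id) DO UPDATE
SET comment = 'I got updated';
```

The application then has to pick the partition name that matches the event's week, and the constraint has to be added to each new partition as it is created.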