Avoid inserting duplicates when using autoincrementing index

Question

I have this query:

INSERT INTO tweet_hashtags(hashtag_id, tweet_id)
VALUES(1, 1) 
ON CONFLICT DO NOTHING 
RETURNING id

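It works fine and inserts a row with id = 1, but when there is a duplicate, say another (1, 1), it inserts it again with id = 2. I want to prevent this. I read that I can use ON CONFLICT (col_name), but that doesn't really help because I need to check two values at once.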

Answer 1

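The on conflict clause needs a unique constraint or index on the set of columns that you want to be unique, and it looks like you don't have one.

You can set it up when creating the table: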

create table tweet_hashtags(
    id serial primary key, 
    hashtag_id int, 
    tweet_id int, 
    unique (hashtag_id, tweet_id)
);

Or, if the table already exists, you can create a unique index (but you need to get rid of the duplicates first):

create unique index idx_tweet_hashtags on tweet_hashtags(hashtag_id, tweet_id);
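
The index above can only be built once no duplicate (hashtag_id, tweet_id) pairs remain. A minimal sketch of one way to clear them beforehand, assuming PostgreSQL (the delete ... using form is PostgreSQL-specific) and assuming it is acceptable to keep the row with the lowest id for each pair:

```sql
-- Run this BEFORE creating the unique index.
-- Assumption: for each (hashtag_id, tweet_id) pair, the row with the
-- smallest id is the one to keep; every later copy is deleted.
delete from tweet_hashtags a
using tweet_hashtags b
where a.hashtag_id = b.hashtag_id
  and a.tweet_id   = b.tweet_id
  and a.id > b.id;
```

If other tables reference tweet_hashtags.id, those references would have to be repointed to the surviving rows first.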

Then your query should just work:

insert into tweet_hashtags(hashtag_id, tweet_id)
values(1, 1) 
on conflict (hashtag_id, tweet_id) do nothing 
returning id

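Specifying a conflict target makes the intent clearer and should generally be preferred (although it is not mandatory with do nothing).

Note that the query returns nothing when the insert is skipped (that is, the existing id is not returned).

Here is a demo on DB Fiddle that shows the behavior with and without the unique index.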
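
If the id is needed whether or not the row was just inserted, one common workaround is to wrap the insert in a CTE and fall back to a plain select. This is only a sketch, assuming PostgreSQL and the unique constraint on (hashtag_id, tweet_id) shown earlier; the CTE name ins is arbitrary:

```sql
with ins as (
    insert into tweet_hashtags(hashtag_id, tweet_id)
    values (1, 1)
    on conflict (hashtag_id, tweet_id) do nothing
    returning id
)
-- Row was inserted: ins yields the new id, and the select below does not
-- see the new row yet (both parts share the same snapshot).
select id from ins
union all
-- Row already existed: ins is empty and this returns the existing id.
select id from tweet_hashtags
where hashtag_id = 1 and tweet_id = 1;
```

Under concurrent writes this can still come back empty, for example when another transaction inserts the same pair but has not committed yet, so callers may need to retry.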
