Postgres - Deleting duplicate rows to ensure a unique index works

Question

I have a table with the following structure:

id | foodid | ingredientid

I want to create a unique index like this:

create unique index foodingredient_foodid_ingredientid_uindex
    on foodingredient (foodid, ingredientid);

The problem is that the table contains a large number of duplicate (foodid, ingredientid) entries. They are not needed, and I want to delete them.

If I run:

select count(*)
from foodingredient
group by foodid, ingredientid
having count(*) > 1
order by count desc

This returns half a million rows, so fixing them manually is not an option.
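Note that the query above returns one row per duplicated (foodid, ingredientid) pair, not one row per surplus record. A minimal sketch for counting how many rows a cleanup would actually remove, assuming the same foodingredient table (the aliases cnt, dup_groups and surplus_rows are made up for illustration):

```sql
-- Every row beyond the first in each duplicate group counts as surplus.
select sum(cnt - 1) as surplus_rows
from (
    select count(*) as cnt
    from foodingredient
    group by foodid, ingredientid
    having count(*) > 1
) as dup_groups;
```

For the four-row example below (ids 1 to 4 sharing foodid 144 and ingredientid 531), this would report 3.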

What I want to do is delete all the duplicates while keeping the original row.

That is:

id | foodid | ingredientid
1  | 144    | 531
2  | 144    | 531
3  | 144    | 531
4  | 144    | 531

becomes:

id | foodid | ingredientid
1  | 144    | 531

Is there a way to do this with a query?

Answer 1

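You can use EXISTS. This deletes every row for which another row with the same foodid and ingredientid but a smaller id exists, so only the row with the lowest id in each group is kept: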

delete from foodingredient t
where exists (
  select 1 from foodingredient
  where foodid = t.foodid and ingredientid = t.ingredientid and id < t.id
)

See the demo.
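Before running the delete on half a million groups, it may be worth previewing what it would remove. A minimal sketch that reuses the same EXISTS predicate in a SELECT, assuming id is unique and not null (the alias d and the limit of 20 are arbitrary choices for illustration):

```sql
-- Dry run: list rows that the DELETE above would remove.
-- A row qualifies when an older row (smaller id) has the same pair.
select t.id, t.foodid, t.ingredientid
from foodingredient t
where exists (
  select 1
  from foodingredient d
  where d.foodid = t.foodid
    and d.ingredientid = t.ingredientid
    and d.id < t.id
)
order by t.foodid, t.ingredientid, t.id
limit 20;
```

With the sample data from the question, this lists ids 2, 3 and 4 and leaves id 1 untouched.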

Answer 2

DELETE FROM foodingredient a
USING foodingredient b
WHERE a.id > b.id
    AND a.foodid = b.foodid 
    AND a.ingredientid = b.ingredientid;
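The self-join above pairs each row with any older row that has the same (foodid, ingredientid) and deletes the newer one. As a rough sketch, the cleanup and the index creation could be combined in one transaction, so that a failure while building the index rolls the delete back as well. This assumes id is unique and not null, and that holding locks on the table for the duration is acceptable:

```sql
begin;

-- Remove every row that has an older twin (same pair, smaller id).
delete from foodingredient a
using foodingredient b
where a.id > b.id
  and a.foodid = b.foodid
  and a.ingredientid = b.ingredientid;

-- Sanity check: this should now return no rows.
select foodid, ingredientid, count(*)
from foodingredient
group by foodid, ingredientid
having count(*) > 1;

-- The index from the question; it fails if any duplicates remain.
create unique index foodingredient_foodid_ingredientid_uindex
    on foodingredient (foodid, ingredientid);

commit;
```

Rows where foodid or ingredientid is NULL are not matched by the equality comparisons and are left in place; a unique index in this Postgres version also treats NULLs as distinct, so they do not block the index.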

sql

postgresql

duplicates

postgresql-9.5