如何从 postgressql sql table 中删除重复的行

How to drop duplicate rows from postgresql sql table


date        | window  | points  |    actual_bool      |         previous_bool          |       creation_time        | source 
------------+---------+---------+---------------------+---------------------------------+----------------------------+--------
 2021-02-11 |     110 |     0.6 |                   0 |                               0 | 2021-02-14 09:20:57.51966  | bldgh
 2021-02-11 |     150 |     0.7 |                   1 |                               0 | 2021-02-14 09:20:57.51966  | fiata
 2021-02-11 |     110 |     0.7 |                   1 |                               0 | 2021-02-14 09:20:57.51966  | nfiws
 2021-02-11 |     150 |     0.7 |                   1 |                               0 | 2021-02-14 09:20:57.51966  | fiata
 2021-02-11 |     110 |     0.6 |                   0 |                               0 | 2021-02-14 09:20:57.51966  | bldgh
 2021-02-11 |     110 |     0.3 |                   0 |                               1 | 2021-02-14 09:22:22.969014 | asdg1
 2021-02-11 |     110 |     0.6 |                   0 |                               0 | 2021-02-14 09:22:22.969014 | j
 2021-02-11 |     110 |     0.3 |                   0 |                               1 | 2021-02-14 09:22:22.969014 | aba
 2021-02-11 |     110 |     0.5 |                   0 |                               1 | 2021-02-14 09:22:22.969014 | fg
 2021-02-11 |     110 |     0.6 |                   1 |                               0 | 2021-02-14 09:22:22.969014 | wdda
 2021-02-11 |     110 |     0.7 |                   1 |                               1 | 2021-02-14 09:23:21.977685 | dda
 2021-02-11 |     110 |     0.5 |                   1 |                               0 | 2021-02-14 09:23:21.977685 | dd
 2021-02-11 |     110 |     0.6 |                   1 |                               1 | 2021-02-14 09:23:21.977685 | so
 2021-02-11 |     110 |     0.5 |                   1 |                               1 | 2021-02-14 09:23:21.977685 | dar
 2021-02-11 |     110 |     0.6 |                   1 |                               1 | 2021-02-14 09:23:21.977685 | firr
 2021-02-11 |     110 |     0.8 |                   1 |                               1 | 2021-02-14 09:24:15.831411 | xim
 2021-02-11 |     110 |     0.8 |                   1 |                               1 | 2021-02-14 09:24:15.831411 | cxyy
 2021-02-11 |     110 |     0.3 |                   0 |                               1 | 2021-02-14 09:24:15.831411 | bisd
 2021-02-11 |     110 |     0.1 |                   0 |                               1 | 2021-02-14 09:24:15.831411 | cope
 2021-02-11 |     110 |     0.2 |                   0 |                               1 | 2021-02-14 09:24:15.831411 | sand
 ...

我在 postgresql table 中有以下数据集,在 testdb 中称为 testtable。

我不小心复制了数据库并复制了行。

如何删除重复项?

第 1 行和第 5 行是此帧中的副本,第 2 行和第 4 行也是副本。

我以前从未使用过 sql 删除重复项我不知道从哪里开始。

我试过了

select creation_time, count(creation_time) from classification group by creation_time having count (creation_time)>1 order by source;

但它所做的只是告诉我我每天有多少次重复,

像这样

       creation_time        | count 
----------------------------+-------
 2021-02-14 09:20:57.51966  |    10
 2021-02-14 09:22:22.969014 |    10
 2021-02-14 09:23:21.977685 |    10
 2021-02-14 09:24:15.831411 |    10
 2021-02-14 09:24:27.733763 |    10
 2021-02-14 09:24:38.41793  |    10
 2021-02-14 09:27:04.432466 |    10
 2021-02-14 09:27:21.62256  |    10
 2021-02-14 09:27:22.677763 |    10
 2021-02-14 09:27:37.996054 |    10
 2021-02-14 09:28:09.275041 |    10
 2021-02-14 09:28:22.649391 |    10
...

每个 creation_timestamp 中应该只有 5 个唯一记录。

它没有向我显示重复项,即使我显示了它也不知道如何删除它们。

要删除的行很多。我建议重新创建 table:

create table new_classification as
    select distinct c.*
    from classification c;

验证数据后,如果确实需要,可以重新加载:

truncate table classification;

insert into classification
    select *
    from new_classification;

这个过程应该比删除 90% 的行要快得多。