Netezza: delete records that have different timestamps but the same values in specific fields

I have a Netezza table where updates can load overlapping data, but the associated timestamp field differs on each row. For example:

+-----------------+----------+-------------+-----+
|       ts        | first_nm |   last_nm   | val |
+-----------------+----------+-------------+-----+
| 4/1/2015 4:15pm | ben      | bloomington | 12  |
| 4/1/2015 4:20pm | ben      | bloomington | 4.5 |
| 4/1/2015 4:20pm | andrew   | bloomberg   | 2.8 |
+-----------------+----------+-------------+-----+

I want to keep the following records and delete the ben bloomington row with the earlier timestamp:

+-----------------+----------+-------------+-----+
|       ts        | first_nm |   last_nm   | val |
+-----------------+----------+-------------+-----+
| 4/1/2015 4:20pm | ben      | bloomington | 4.5 |
| 4/1/2015 4:20pm | andrew   | bloomberg   | 2.8 |
+-----------------+----------+-------------+-----+

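So, for each distinct combination of first_nm and last_nm, how can I keep only the row with the latest ts and its val?

I think I could use the row_number() function, but I'm not sure how to implement it in my delete statement.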

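You can use the following example to delete every row that does not have the latest timestamp for its first_nm, last_nm pair. It uses the window function row_number() to rank the rows in each group and rowid to identify the ones to delete.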

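delete from <table>
where rowid in
   (
    select rwid
    from (  -- rank the rows of each (first_nm, last_nm) group, newest ts first
            select rowid as rwid
                 , row_number() over (partition by first_nm, last_nm order by ts desc) as rown
            from <table>
         ) sub
    where sub.rown > 1   -- rown = 1 is the newest row of its group and is kept
   );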
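Before running the delete, it may be worth previewing which rows it would remove. The query below is a minimal sketch of that check: it reuses the same row_number() ranking but only selects. The table name person_vals is a hypothetical stand-in for <table>.

```sql
-- Preview only: lists the rows the delete above would remove.
-- person_vals is a hypothetical table name used in place of <table>.
select ts, first_nm, last_nm, val
from (
    select ts, first_nm, last_nm, val
         , row_number() over (partition by first_nm, last_nm order by ts desc) as rown
    from person_vals
) sub
where sub.rown > 1          -- everything except the newest row per group
order by first_nm, last_nm, ts;
```

With the sample data from the question, this should list only the 4:15pm ben bloomington row.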

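A shorter solution that should do the same thing, as long as ts is unique within each first_nm, last_nm pair (rows tied on the latest ts are all kept), is: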

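 -- delete a row whenever a newer row exists for the same first_nm, last_nm
 DELETE FROM <table> t
 WHERE 
   EXISTS (SELECT 1 FROM <table> newer
           WHERE t.ts < newer.ts 
             AND t.first_nm = newer.first_nm 
             AND t.last_nm = newer.last_nm);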
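
To try the EXISTS approach on the data from the question, a scratch table can be loaded and checked as sketched below. The table name person_vals and the column types are assumptions, not something given in the question. The delete compares ts directly and refers to the outer table by name rather than through an alias.

```sql
-- Hypothetical scratch table; name and column types are assumptions.
create table person_vals (
    ts       timestamp,
    first_nm varchar(50),
    last_nm  varchar(50),
    val      numeric(10,1)
);

insert into person_vals values ('2015-04-01 16:15:00', 'ben',    'bloomington', 12);
insert into person_vals values ('2015-04-01 16:20:00', 'ben',    'bloomington', 4.5);
insert into person_vals values ('2015-04-01 16:20:00', 'andrew', 'bloomberg',   2.8);

-- Remove every row for which a newer row with the same names exists.
delete from person_vals
where exists (
    select 1
    from person_vals newer
    where newer.first_nm = person_vals.first_nm
      and newer.last_nm  = person_vals.last_nm
      and newer.ts       > person_vals.ts
);

-- Two rows should remain: the 4:20pm row for each person.
select ts, first_nm, last_nm, val
from person_vals
order by last_nm, first_nm;
```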