Netezza: delete records with different timestamp fields where specific fields are the same
I have a Netezza table where data can overlap when it is updated, but the timestamp
field associated with each record will differ. For example:
+-----------------+----------+-------------+-----+
| ts | first_nm | last_nm | val |
+-----------------+----------+-------------+-----+
| 4/1/2015 4:15pm | ben | bloomington | 12 |
| 4/1/2015 4:20pm | ben | bloomington | 4.5 |
| 4/1/2015 4:20pm | andrew | bloomberg | 2.8 |
+-----------------+----------+-------------+-----+
I want to keep the following records and delete the ben bloomington row with the earlier timestamp:
+-----------------+----------+-------------+-----+
| ts | first_nm | last_nm | val |
+-----------------+----------+-------------+-----+
| 4/1/2015 4:20pm | ben | bloomington | 4.5 |
| 4/1/2015 4:20pm | andrew | bloomberg | 2.8 |
+-----------------+----------+-------------+-----+
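So, for each distinct combination of first_nm and last_nm, how can I keep only the row with the latest ts and its val?
I think I could use the row_number() function, but I'm not sure how to work it into my delete statement.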
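You can use the following example to delete every row that does not carry the latest timestamp. I added the window function row_number()
as an example: it numbers the rows within each (first_nm, last_nm) group from newest to oldest, so every row numbered above 1 is deleted.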
delete from <table>
where rowid in
(
select rwid
from ( select rowid as rwid
, row_number() over(partition by first_nm,last_nm order by ts desc) as rown
from <table>
) sub
where sub.rown>1
);
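To try this pattern out without a Netezza system, here is a minimal sketch that runs the same delete against an in-memory SQLite database from Python. SQLite is only a stand-in here (it also exposes a rowid and supports row_number() in version 3.25 and later). The table name people, the helper dedupe_keep_latest, and the ISO-style timestamp strings are assumptions made for illustration; they are not part of the original question.

```python
import sqlite3


def dedupe_keep_latest(conn):
    """Delete every row except the newest one per (first_nm, last_nm)."""
    # Same shape as the Netezza statement above; "people" is a hypothetical table name.
    conn.execute(
        """
        DELETE FROM people
        WHERE rowid IN (
            SELECT rwid
            FROM (
                SELECT rowid AS rwid,
                       ROW_NUMBER() OVER (
                           PARTITION BY first_nm, last_nm
                           ORDER BY ts DESC
                       ) AS rown
                FROM people
            ) sub
            WHERE sub.rown > 1
        )
        """
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (ts TEXT, first_nm TEXT, last_nm TEXT, val REAL)")
# Sample rows from the question; timestamps are written in ISO form so that
# string ordering matches chronological ordering (an assumption of this sketch).
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?)",
    [
        ("2015-04-01 16:15", "ben", "bloomington", 12.0),
        ("2015-04-01 16:20", "ben", "bloomington", 4.5),
        ("2015-04-01 16:20", "andrew", "bloomberg", 2.8),
    ],
)

dedupe_keep_latest(conn)
rows = conn.execute(
    "SELECT ts, first_nm, last_nm, val FROM people ORDER BY first_nm"
).fetchall()
print(rows)
```

Running the inner select on its own first is a cheap way to preview which rows would be removed before issuing the delete on real data.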
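A shorter solution that should do the same thing, as long as no two rows in a group share the same latest ts, compares the timestamps directly:
-- delete a row when a newer row exists for the same first_nm and last_nm
DELETE FROM <table> t
WHERE
EXISTS (SELECT 1 FROM <table> newer
WHERE t.ts < newer.ts
AND t.first_nm = newer.first_nm
AND t.last_nm = newer.last_nm);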