Update column "Count" to the value from its highest-ID duplicate row, then delete duplicate rows based on the "Url" column

I want to update the "Count" column so that it holds the value from the duplicate row with the highest ID, and then delete the duplicate rows based on the "Url" column (keeping only the row with the lowest ID).

| ID | First_Name | Count | Url |
| -- | ---------- | ----- | ---------- |
| 1  | A          | 10    |  www.A.com |
| 2  | B          | 21    |  www.B.com |
| 3  | C          | 12    |  www.C.com |
| 4  | D          | 31    |  www.D.com |
| 5  | A          | 13    |  www.A.com |
| 6  | D          | 18    |  www.D.com |
| 7  | A          | 5     |  www.A.com |

EXPECTED RESULT

| ID | First_Name | Count | Url |
| -- | ---------- | ----- | --------- |
| 1  | A          | 5     | www.A.com |
| 2  | B          | 21    | www.B.com |
| 3  | C          | 12    | www.C.com |
| 4  | D          | 18    | www.D.com |

Since you didn't say which database this is, I had to guess. Here is a way to select the latest duplicate in SQL Server (written in SQL Server Management Studio v17.9.1). Fiddle: https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=e266cfbca912dc1a541f5f474a0b018d

This is how you can select the latest values. Using this logic, you can simply update the rows where First_Name matches and delete the rest.

CREATE TABLE #testValues (
   ID int NOT NULL
   ,[First_Name] varchar(255) NOT NULL
   ,[Count]  int NULL
   ,[Url]  varchar(255) NOT NULL
   ) 

INSERT INTO #testValues (
   ID
   ,[First_Name]
   ,[Count]
   ,[Url]
   )

VALUES (1,'A',10,'www.A.com'),
(2,'B',21,'www.B.com'),
(3,'C',12,'www.C.com'),
(4,'D',31,'www.D.com'),
(5,'A',13,'www.A.com'),
(6,'D',18,'www.D.com'),
(7,'A',5,'www.A.com');

-- Lowest ID per First_Name: this is the row that will be kept
select min(a.ID) as ID, a.First_Name
into #lowestId
from #testValues a
group by a.First_Name

-- Latest row of each duplicate group: no row with the same First_Name and Url has a higher ID
select a.*
into #newValues
from #testValues a
left join #testValues b
on a.First_Name=b.First_Name and a.Url=b.Url and a.ID < b.ID
where b.ID is null

-- Pair each latest row with its group's lowest ID; rows without duplicates keep their own ID
select coalesce(b.ID, a.ID) as ID,
a.[Count],
a.First_Name,
a.Url
from #newValues a
left join #lowestId b
on a.First_Name=b.First_Name and b.ID < a.ID
order by coalesce(b.ID, a.ID)

drop table #lowestId
drop table #newValues
drop table #testValues
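
The update-and-delete step mentioned above is not spelled out, so here is a minimal sketch of it. It is an assumption-laden illustration, not the tested SQL Server version: it runs against SQLite through Python's `sqlite3` module so that it is self-contained, the table name `test_values` is hypothetical, and it groups by Url as the question asks. It first copies the Count of each Url's highest-ID row onto every row of that Url, then deletes every row that has a lower-ID row with the same Url.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test_values (
    ID INTEGER PRIMARY KEY,
    First_Name TEXT NOT NULL,
    "Count" INTEGER,
    Url TEXT NOT NULL
);
INSERT INTO test_values (ID, First_Name, "Count", Url) VALUES
    (1, 'A', 10, 'www.A.com'), (2, 'B', 21, 'www.B.com'),
    (3, 'C', 12, 'www.C.com'), (4, 'D', 31, 'www.D.com'),
    (5, 'A', 13, 'www.A.com'), (6, 'D', 18, 'www.D.com'),
    (7, 'A', 5, 'www.A.com');
""")

# Step 1: give every row the Count of the highest-ID row sharing its Url.
# "latest" is the row for which no row with the same Url has a higher ID.
conn.execute("""
UPDATE test_values
SET "Count" = (
    SELECT latest."Count"
    FROM test_values AS latest
    WHERE latest.Url = test_values.Url
      AND NOT EXISTS (
          SELECT 1 FROM test_values AS newer
          WHERE newer.Url = latest.Url AND newer.ID > latest.ID
      )
)
""")

# Step 2: delete every row that has a lower-ID row with the same Url,
# which leaves only the lowest-ID row of each Url.
conn.execute("""
DELETE FROM test_values
WHERE EXISTS (
    SELECT 1 FROM test_values AS older
    WHERE older.Url = test_values.Url AND older.ID < test_values.ID
)
""")

rows = conn.execute(
    'SELECT ID, First_Name, "Count", Url FROM test_values ORDER BY ID'
).fetchall()
for row in rows:
    print(row)
```

The order matters: the update has to run before the delete, because the delete removes the highest-ID rows that the update reads from.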

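I think you want to delete the duplicates grouped by Url, so you can try this. It keeps, for each Url, the row (or rows) with the lowest Count: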

DELETE t FROM your_table t 
INNER JOIN (
   SELECT Url,MIN(count) AS min_count
   FROM your_table
   GROUP BY Url 
   HAVING COUNT(ID) > 1
) as t2 on t2.Url = t.Url
WHERE t.count > t2.min_count

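Another one that does not change the table structure. It first copies the Count of the highest-ID row of each Url onto every row of that Url, then deletes every row that has a lower-ID row with the same Url. It is written in MySQL's multi-table syntax, which is an assumption, since the question does not name the database:

UPDATE your_table yt
JOIN (SELECT Url, MAX(id) AS max_id FROM your_table GROUP BY Url) m ON m.Url = yt.Url
JOIN your_table latest ON latest.id = m.max_id
SET yt.count = latest.count;

DELETE yt FROM your_table yt
JOIN your_table older ON older.Url = yt.Url AND older.id < yt.id;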