SQL: Deleting rows whose values already exist

I have a table that looks like this:

ID | DATE       | NAME   | VALUE_1 | VALUE_2
1  | 27.11.2015 | Homer  | A       | B
2  | 27.11.2015 | Bart   | C       | B
3  | 28.11.2015 | Homer  | A       | C
4  | 28.11.2015 | Maggie | C       | B
5  | 28.11.2015 | Bart   | C       | B

I am currently using the following code to delete duplicate rows (thanks to this thread):

WITH cte AS
(SELECT ROW_NUMBER() OVER (PARTITION BY [VALUE_1], [VALUE_2]
ORDER BY [DATE] DESC) RN
FROM [MY_TABLE])
DELETE FROM cte
WHERE RN > 1

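But this code does not delete exactly the rows I want. I only want to delete rows whose values already exist, so in my example I only want to delete row 5, because row 2 has the same values and is older.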

Code to create my table and insert the values:

CREATE TABLE [t_diff_values]
([id] INT IDENTITY NOT NULL PRIMARY KEY,
[date] DATETIME NOT NULL,
[name] VARCHAR(255) NOT NULL DEFAULT '',
[val1] CHAR(1) NOT NULL DEFAULT '',
[val2] CHAR(1) NOT NULL DEFAULT '');

INSERT INTO [t_diff_values] ([date], [name], [val1], [val2]) VALUES
('2015-11-27','Homer',  'A','B'),
('2015-11-27','Bart',   'C','B'),
('2015-11-28','Homer',  'A','C'),
('2015-11-28','Maggie', 'C','B'),
('2015-11-28','Bart',   'C','B');

You need to add one more CTE in which you index all the islands, and then apply your duplicate logic in the second CTE:

DECLARE @t TABLE
    (
      ID INT ,
      DATE DATE ,
      VALUE_1 CHAR(1) ,
      VALUE_2 CHAR(1)
    )

INSERT  INTO @t
VALUES  ( 1, '20151127', 'A', 'B' ),
        ( 2, '20151128', 'C', 'B' ),
        ( 3, '20151129', 'A', 'B' ),
        ( 4, '20151130', 'A', 'B' );
WITH    cte1
          AS ( SELECT   * ,
                        ROW_NUMBER() OVER ( ORDER BY date)
                        - ROW_NUMBER() OVER ( PARTITION BY VALUE_1, VALUE_2 ORDER BY DATE) AS gr
               FROM     @t
             ),
        cte2
          AS ( SELECT   * ,
                        ROW_NUMBER() OVER ( PARTITION BY VALUE_1, VALUE_2, gr ORDER BY date) AS rn
               FROM     cte1
             )
    DELETE  FROM cte2
    WHERE   rn > 1

SELECT  *
FROM    @t
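
To see why the difference of the two row numbers identifies an island, it may help to look at the intermediate values before deleting anything. The following is a minimal sketch, assuming SQL Server and the same four sample rows as above; the column aliases rn_all and rn_grp are names introduced here for illustration only.

```sql
-- Illustrative only: the sample rows are redeclared so this batch runs on its own.
DECLARE @t TABLE (ID INT, [DATE] DATE, VALUE_1 CHAR(1), VALUE_2 CHAR(1));

INSERT INTO @t VALUES
    (1, '20151127', 'A', 'B'),
    (2, '20151128', 'C', 'B'),
    (3, '20151129', 'A', 'B'),
    (4, '20151130', 'A', 'B');

SELECT  ID, [DATE], VALUE_1, VALUE_2,
        -- position of the row in the whole table, by date
        ROW_NUMBER() OVER (ORDER BY [DATE]) AS rn_all,
        -- position of the row among rows with the same value pair, by date
        ROW_NUMBER() OVER (PARTITION BY VALUE_1, VALUE_2 ORDER BY [DATE]) AS rn_grp,
        -- the difference stays constant while equal rows follow each other
        ROW_NUMBER() OVER (ORDER BY [DATE])
        - ROW_NUMBER() OVER (PARTITION BY VALUE_1, VALUE_2 ORDER BY [DATE]) AS gr
FROM    @t
ORDER BY [DATE];

-- ID  VALUE_1  VALUE_2  rn_all  rn_grp  gr
-- 1   A        B        1       1       0
-- 2   C        B        2       1       1
-- 3   A        B        3       2       1
-- 4   A        B        4       3       1
```

Rows 3 and 4 share the same (VALUE_1, VALUE_2, gr) combination, so they form one island, and the second CTE numbers them 1 and 2. Row 1 has the same values but a different gr, because row 2 sits between it and row 3, so it is kept as a separate island. With this sample data the DELETE above therefore removes only row 4.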

You can use this query:

WITH cte AS
(
    SELECT RN = ROW_NUMBER() OVER (ORDER BY ID) 
    , *
    FROM @data
)
DELETE FROM c1
--SELECT * 
FROM CTE c1
INNER JOIN CTE c2 ON c1.RN +1 = c2.RN AND c1.VALUE_1 = c2.VALUE_1 AND c1.VALUE_2 = c2.VALUE_2

Here I order by ID. If the next row (RN + 1) has the same VALUE_1 and VALUE_2, the current row is deleted.

Output:

ID  DATE        VALUE_1 VALUE_2
1   2015-11-27  A       B
2   2015-11-28  C       B
4   2015-11-30  A       B

Data:

declare @data table(ID int, [DATE] date, VALUE_1 char(1), VALUE_2 char(1));
insert into @data(ID, [DATE], VALUE_1, VALUE_2) values
(1, '20151127', 'A', 'B'), 
(2, '20151128', 'C', 'B'), 
(3, '20151129', 'A', 'B'), 
(4, '20151130', 'A', 'B');
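
The same neighbour comparison can probably be written without a self-join by using LAG, which is available in SQL Server 2012 and later. This is a sketch under that assumption, not part of the original answer; prev_v1 and prev_v2 are names introduced here. Note that it compares each row with the previous one, so it keeps the older row of each run, whereas the self-join above keeps the newer one.

```sql
-- Illustrative only: the sample rows are redeclared so this batch runs on its own.
DECLARE @data TABLE (ID INT, [DATE] DATE, VALUE_1 CHAR(1), VALUE_2 CHAR(1));

INSERT INTO @data (ID, [DATE], VALUE_1, VALUE_2) VALUES
    (1, '20151127', 'A', 'B'),
    (2, '20151128', 'C', 'B'),
    (3, '20151129', 'A', 'B'),
    (4, '20151130', 'A', 'B');

WITH cte AS
(
    SELECT  *,
            -- values of the row immediately before this one, by ID
            LAG(VALUE_1) OVER (ORDER BY ID) AS prev_v1,
            LAG(VALUE_2) OVER (ORDER BY ID) AS prev_v2
    FROM    @data
)
-- Delete a row only when it repeats the row directly before it.
-- The first row has NULL in prev_v1/prev_v2, so the comparison is not true and it is kept.
DELETE FROM cte
WHERE   VALUE_1 = prev_v1
  AND   VALUE_2 = prev_v2;

SELECT * FROM @data ORDER BY ID;
-- With this sample data, rows 1, 2 and 3 remain; row 4 repeats row 3 and is removed.
```

Replacing LAG with LEAD would compare each row with the following one instead, which should reproduce the result of the self-join (rows 1, 2 and 4 remain).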

Try this:

CREATE TABLE [dbo].[Employee](
    [ID] INT NOT NULL,
    [Date] DateTime NOT NULL,
    [VAL1] varchar(20) NOT NULL,
    [VAL2] varchar(20) NOT NULL
)

INSERT INTO [dbo].[Employee] VALUES 
            (1,'2015-11-27 10:44:33.087','A','B')
INSERT INTO [dbo].[Employee] VALUES
            (2,'2015-11-28 10:44:33.087','C','B')
INSERT INTO [dbo].[Employee] VALUES
            (3,'2015-11-29 10:44:33.087','A','B') 
INSERT INTO [dbo].[Employee] VALUES
            (4,'2015-11-30 10:44:33.087','A','B');  -- terminate with ';' so the WITH below parses

with cte as(
    select
        *,
        rn = row_number() over(partition by [VAL1], [VAL2]
                               order by [Date] desc),
        cc = count(*) over(partition by [VAL1], [VAL2])
    from [Employee]
)

delete
from cte
where
    rn > 1 and rn < cc

select * from [Employee]