SQL - Filter redundant duplicates

I have a table "Conflicts" with two process ID columns (ProcessIDA int, ProcessIDB int).

A unique conflict is defined when two processes have been entered into this "Conflicts" table in either order (A/B or B/A).

The Conflicts table contains duplicates like this:

[Row 1] ProcessIDA=5, ProcessIDB=6

[Row 2] ProcessIDA=6, ProcessIDB=5

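What I need to do is filter out the duplicate conflicts so that I am left with only:

[Row 1] ProcessIDA=5, ProcessIDB=6

Note: the table may hold anywhere from 5 to 50 million records. Once I have successfully filtered out the duplicates, the row count will be exactly half of what it is now.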

You can use a simple self join to find the rows whose reversed pair is also present:

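;WITH   Conflicts   AS
(
    -- Sample data standing in for the real Conflicts table
    SELECT      *
    FROM    (   VALUES
                (6, 5),
                (5, 6),
                (1, 2),
                (1, 3)
            )   Sample (ProcessIDA, ProcessIDB)
)
-- Match each row against its mirror image (A/B against B/A).
-- For the sample data this returns both (6, 5) and (5, 6).
SELECT  A.*
FROM    Conflicts A
JOIN    Conflicts B
    ON  A.ProcessIDA = B.ProcessIDB AND
        A.ProcessIDB = B.ProcessIDA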
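
To keep just one row from each mirrored pair instead of listing both, the self join can be turned into a NOT EXISTS test. The following is a minimal sketch rather than part of the answer above. It reuses the same sample data and assumes it is acceptable to keep the row whose ProcessIDA holds the smaller ID:

```sql
;WITH Conflicts AS
(
    -- Same sample data as above
    SELECT *
    FROM (VALUES (6, 5), (5, 6), (1, 2), (1, 3)) Sample (ProcessIDA, ProcessIDB)
)
SELECT A.*
FROM   Conflicts A
WHERE  A.ProcessIDA <= A.ProcessIDB      -- already in low/high order: keep
   OR  NOT EXISTS                        -- reversed, but no mirror row exists: keep
       (
           SELECT 1
           FROM   Conflicts B
           WHERE  B.ProcessIDA = A.ProcessIDB
             AND  B.ProcessIDB = A.ProcessIDA
       );
```

For the sample data this leaves (5, 6), (1, 2) and (1, 3), in no guaranteed order. Rows that repeat in the same order (for example two copies of (5, 6)) would both survive, so this only handles the A/B versus B/A case described in the question.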

If you want to delete the duplicates from the table itself, use this query:

;WITH cte AS
(
    -- Normalize each pair so the smaller ID is always in column1
    SELECT *,
           CASE WHEN ProcessIDA < ProcessIDB
                THEN ProcessIDA ELSE ProcessIDB END AS column1,
           CASE WHEN ProcessIDA < ProcessIDB
                THEN ProcessIDB ELSE ProcessIDA END AS column2
    FROM Conflicts
),
cte2 AS
(
    -- Number the rows within each normalized pair
    SELECT rn = ROW_NUMBER() OVER
           (
               PARTITION BY cte.column1, cte.column2
               ORDER BY cte.column1
           ), *
    FROM cte
)
-- Keep one row per pair and delete the rest
DELETE FROM cte2
WHERE rn > 1;
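
After the delete, it can be worth confirming that no unordered pair appears more than once. The check below is a sketch that assumes the same Conflicts table; the aliases LowID, HighID and Occurrences are illustrative names, not part of the original schema:

```sql
-- Lists every unordered pair that still occurs more than once.
-- An empty result means the duplicates are gone.
SELECT
    CASE WHEN ProcessIDA < ProcessIDB
         THEN ProcessIDA ELSE ProcessIDB END AS LowID,
    CASE WHEN ProcessIDA < ProcessIDB
         THEN ProcessIDB ELSE ProcessIDA END AS HighID,
    COUNT(*) AS Occurrences
FROM Conflicts
GROUP BY
    CASE WHEN ProcessIDA < ProcessIDB
         THEN ProcessIDA ELSE ProcessIDB END,
    CASE WHEN ProcessIDA < ProcessIDB
         THEN ProcessIDB ELSE ProcessIDA END
HAVING COUNT(*) > 1;
```

Running the same query before the delete shows how many rows each pair currently has, which gives a rough idea of how many rows the delete will remove.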
