Find non-duplicate column value combinations

This is for a migration script.

CompanyTable:

EmployeeId  DivisionId
abc         div1
def         div1
abc         div1
abc         div2
xyz         div2

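In the code below, I am selecting the duplicate EmployeeId-DivisionId combinations, that is, records with the same EmployeeId and DivisionId will be selected. So from the table above, the two rows with the abc-div1 combination will be selected by the code below.

How do I invert it? It seems simple, but I can't figure it out. I tried HAVING count(*) = 0 instead of > 1, and I tried fiddling with the equals signs in the ON and AND lines. Basically, from the table above, I want to select the other three rows, the ones that do not have the abc-div1 combination. If there is a way to select all unique EmployeeId-DivisionId combinations, please let me know.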

SELECT a.EmployeeID, a.DivisionId FROM CompanyTable a
  JOIN ( SELECT EmployeeID, DivisionId 
         FROM CompanyTable 
         GROUP BY EmployeeID, DivisionId 
         HAVING count(*) > 1 ) b
    ON a.EmployeeID = b.EmployeeID
   AND a.DivisionId = b.DivisionId;

Both EmployeeId and DivisionId are nvarchar(50) columns.

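As already mentioned, you have to replace > 1 with its true opposite, <= 1. HAVING count(*) = 0 can never match, because every group produced by GROUP BY contains at least one row. With <= 1 it works: db<>fiddle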
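As a rough check of that claim, here is a minimal sketch that runs the question's join against the sample rows with three different HAVING conditions. It makes some assumptions that are not in the original: Python's sqlite3 module stands in for SQL Server, TEXT replaces nvarchar(50), and the helper name rows_where_group_count is made up for illustration. An ORDER BY is added only to make the output deterministic.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for SQL Server,
# and TEXT columns stand in for nvarchar(50).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CompanyTable (EmployeeId TEXT, DivisionId TEXT)")
conn.executemany(
    "INSERT INTO CompanyTable VALUES (?, ?)",
    [("abc", "div1"), ("def", "div1"), ("abc", "div1"),
     ("abc", "div2"), ("xyz", "div2")],
)


def rows_where_group_count(condition):
    """Hypothetical helper: run the question's join with a given HAVING condition."""
    sql = f"""
        SELECT a.EmployeeId, a.DivisionId
        FROM CompanyTable a
        JOIN (SELECT EmployeeId, DivisionId
              FROM CompanyTable
              GROUP BY EmployeeId, DivisionId
              HAVING {condition}) b
          ON a.EmployeeId = b.EmployeeId
         AND a.DivisionId = b.DivisionId
        ORDER BY a.EmployeeId, a.DivisionId
    """
    return conn.execute(sql).fetchall()


duplicated = rows_where_group_count("count(*) > 1")       # original query
non_duplicated = rows_where_group_count("count(*) <= 1")  # inverted query
never_matches = rows_where_group_count("count(*) = 0")    # the failed attempt

print(duplicated)
print(non_duplicated)
print(never_matches)
```

On the sample data, the <= 1 variant should return the three rows other than abc-div1, while the = 0 variant returns nothing, since no group can be empty.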

A windowed count seems like a suitable approach:

select employeeid, divisionid
from (
    select *, Count(*) over(partition by employeeid, divisionid) ct
    from t
)t
where ct = 1;
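To see what the windowed count returns on the question's sample rows, the sketch below runs an equivalent query through Python's sqlite3 module. The assumptions here are mine, not the answer's: SQLite (3.25 or later, for window function support) replaces SQL Server, the table is named CompanyTable rather than t, and the derived-table alias counted is invented. It also runs SELECT DISTINCT for contrast, since the question asks about listing every unique combination as well.

```python
import sqlite3

# Assumption: in-memory SQLite (3.25+ for window functions) stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CompanyTable (EmployeeId TEXT, DivisionId TEXT)")
conn.executemany(
    "INSERT INTO CompanyTable VALUES (?, ?)",
    [("abc", "div1"), ("def", "div1"), ("abc", "div1"),
     ("abc", "div2"), ("xyz", "div2")],
)

# Combinations that occur exactly once: each row carries the size of its
# (EmployeeId, DivisionId) partition, and the outer query keeps size 1.
occurs_once = conn.execute("""
    SELECT EmployeeId, DivisionId
    FROM (
        SELECT EmployeeId, DivisionId,
               COUNT(*) OVER (PARTITION BY EmployeeId, DivisionId) AS ct
        FROM CompanyTable
    ) AS counted
    WHERE ct = 1
    ORDER BY EmployeeId, DivisionId
""").fetchall()

# Every combination listed once, including the duplicated abc-div1.
all_distinct = conn.execute("""
    SELECT DISTINCT EmployeeId, DivisionId
    FROM CompanyTable
    ORDER BY EmployeeId, DivisionId
""").fetchall()

print(occurs_once)
print(all_distinct)
```

The two results differ only in abc-div1: the windowed filter drops it entirely, whereas DISTINCT collapses its two rows into one.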

First, let's try rewriting your query using a common table expression (CTE) instead of a subquery:

WITH cteCompanyTableStats as (
    SELECT 
        EmployeeID, DivisionId, 
        HasDuplicates = CASE WHEN count(*) > 1 THEN 1 ELSE 0 END
    FROM CompanyTable 
    GROUP BY EmployeeID, DivisionId 
)
SELECT ct.*
FROM CompanyTable ct
    inner join cteCompanyTableStats cts on
        ct.EmployeeId = cts.EmployeeId 
        and ct.DivisionId = cts.DivisionId
        and cts.HasDuplicates = 1

Notice how I removed the HAVING clause and added a new HasDuplicates column? We will use that new column to find all table rows that -DON'T- have duplicates:

WITH cteCompanyTableStats as (
    SELECT 
        EmployeeID, DivisionId, 
        HasDuplicates = CASE WHEN count(*) > 1 THEN 1 ELSE 0 END
    FROM CompanyTable 
    GROUP BY EmployeeID, DivisionId 
)
SELECT ct.*
FROM CompanyTable ct
    inner join cteCompanyTableStats cts on
        ct.EmployeeId = cts.EmployeeId 
        and ct.DivisionId = cts.DivisionId
        and cts.HasDuplicates = 0

The only character in the SQL code that changes between the two queries is on the last line, which sets and cts.HasDuplicates = ###
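Because the two CTE queries differ only in that flag value, one way to illustrate the point is to bind the flag as a parameter and run the same statement twice. This is only a sketch under assumptions of my own: Python's sqlite3 module stands in for SQL Server, the T-SQL alias form HasDuplicates = CASE ... is rewritten as CASE ... AS HasDuplicates because SQLite does not accept the former, and the names CTE_QUERY, with_duplicates, and without_duplicates are invented.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CompanyTable (EmployeeId TEXT, DivisionId TEXT)")
conn.executemany(
    "INSERT INTO CompanyTable VALUES (?, ?)",
    [("abc", "div1"), ("def", "div1"), ("abc", "div1"),
     ("abc", "div2"), ("xyz", "div2")],
)

# Same shape as the CTE queries above; the flag is a bound parameter (?).
CTE_QUERY = """
    WITH cteCompanyTableStats AS (
        SELECT EmployeeId, DivisionId,
               CASE WHEN count(*) > 1 THEN 1 ELSE 0 END AS HasDuplicates
        FROM CompanyTable
        GROUP BY EmployeeId, DivisionId
    )
    SELECT ct.EmployeeId, ct.DivisionId
    FROM CompanyTable ct
    INNER JOIN cteCompanyTableStats cts
        ON ct.EmployeeId = cts.EmployeeId
       AND ct.DivisionId = cts.DivisionId
       AND cts.HasDuplicates = ?
    ORDER BY ct.EmployeeId, ct.DivisionId
"""

with_duplicates = conn.execute(CTE_QUERY, (1,)).fetchall()     # flag = 1
without_duplicates = conn.execute(CTE_QUERY, (0,)).fetchall()  # flag = 0

print(with_duplicates)
print(without_duplicates)
```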