Find non-duplicate column value combinations
This is for a migration script.
Company table:
| EmployeeId | DivisionId |
|---|---|
| abc | div1 |
| def | div1 |
| abc | div1 |
| abc | div2 |
| xyz | div2 |
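In the code below, I am selecting the duplicated EmployeeId-DivisionId combinations; that is, records that share the same EmployeeId and DivisionId are selected. So from the table above, the two rows with the abc-div1 combination are selected by the code below.
How do I invert this? It seems simple, but I can't figure it out. I tried HAVING count(*) = 0 in place of > 1, and I tried fiddling with the equals signs in the ON and AND lines. Basically, from the table above, I want to select the other three rows, the ones that do not have the abc-div1 combination. If there is a way to select all the unique EmployeeID-DivisionId combinations, please let me know.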
SELECT a.EmployeeID, a.DivisionId FROM CompanyTable a
JOIN ( SELECT EmployeeID, DivisionId
FROM CompanyTable
GROUP BY EmployeeID, DivisionId
HAVING count(*) > 1 ) b
ON a.EmployeeID = b.EmployeeID
AND a.DivisionId = b.DivisionId;
Both EmployeeId and DivisionId are nvarchar(50) columns.
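As already mentioned, you have to replace > 1 with its true opposite, <= 1. A group produced by GROUP BY always contains at least one row, so count(*) = 0 can never match, and <= 1 here means exactly 1. This works: db<>fiddle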
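Applied to the query from the question, the change might look like the following sketch. Only the HAVING comparison differs; the table and column names are the ones from the question, and the row order is not guaranteed without an ORDER BY.

```sql
-- Rows whose (EmployeeID, DivisionId) combination occurs exactly once.
SELECT a.EmployeeID, a.DivisionId
FROM CompanyTable a
JOIN ( SELECT EmployeeID, DivisionId
       FROM CompanyTable
       GROUP BY EmployeeID, DivisionId
       HAVING COUNT(*) <= 1 ) b      -- was: > 1
  ON a.EmployeeID = b.EmployeeID
 AND a.DivisionId = b.DivisionId;
```

Against the sample table this should return the def-div1, abc-div2 and xyz-div2 rows.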
A window count seems like a suitable approach (t below is the source table, CompanyTable in the question):
select employeeid, divisionid
from (
select *, Count(*) over(partition by employeeid, divisionid) ct
from t
)t
where ct = 1;
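To try the window count without touching real data, something like the following self-contained sketch may help. The table variable @CompanyTable and the alias counted are assumptions made for illustration only; the rows are the sample data from the question.

```sql
-- Hypothetical stand-in for CompanyTable, filled with the question's sample rows.
DECLARE @CompanyTable TABLE (
    EmployeeId nvarchar(50),
    DivisionId nvarchar(50)
);

INSERT INTO @CompanyTable (EmployeeId, DivisionId) VALUES
    (N'abc', N'div1'),
    (N'def', N'div1'),
    (N'abc', N'div1'),
    (N'abc', N'div2'),
    (N'xyz', N'div2');

-- Count the rows in each (EmployeeId, DivisionId) partition,
-- then keep only the rows whose partition holds a single row.
SELECT EmployeeId, DivisionId
FROM (
    SELECT EmployeeId, DivisionId,
           COUNT(*) OVER (PARTITION BY EmployeeId, DivisionId) AS ct
    FROM @CompanyTable
) AS counted
WHERE ct = 1;
```

The two abc-div1 rows each get ct = 2 and are filtered out, which leaves def-div1, abc-div2 and xyz-div2. Unlike the GROUP BY version, this reads the table once and needs no join back.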
First, let's try rewriting your query using a common table expression (CTE) instead of a subquery:
WITH cteCompanyTableStats as (
SELECT
EmployeeID, DivisionId,
HasDuplicates = CASE WHEN count(*) > 1 THEN 1 ELSE 0 END
FROM CompanyTable
GROUP BY EmployeeID, DivisionId
)
SELECT ct.*
FROM CompanyTable ct
inner join cteCompanyTableStats cts on
ct.EmployeeId = cts.EmployeeId
and ct.DivisionId = cts.DivisionId
and cts.HasDuplicates = 1
Notice how I removed the HAVING clause and added a new HasDuplicates column? We will use that new column to find all the table rows that DON'T have duplicates:
WITH cteCompanyTableStats as (
SELECT
EmployeeID, DivisionId,
HasDuplicates = CASE WHEN count(*) > 1 THEN 1 ELSE 0 END
FROM CompanyTable
GROUP BY EmployeeID, DivisionId
)
SELECT ct.*
FROM CompanyTable ct
inner join cteCompanyTableStats cts on
ct.EmployeeId = cts.EmployeeId
and ct.DivisionId = cts.DivisionId
and cts.HasDuplicates = 0
The only characters of SQL code that change between the two queries are on the last line, which sets and cts.HasDuplicates = ###.
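Since the flag is just a column, one possible variation is to drop the filter and return HasDuplicates alongside each row, so both groups can be inspected in a single pass. This is only a sketch under the same assumptions as above (CompanyTable and its two columns as in the question); the ORDER BY is added purely to make the output easier to read.

```sql
WITH cteCompanyTableStats AS (
    SELECT
        EmployeeID, DivisionId,
        -- 1 when the combination occurs more than once, otherwise 0
        CASE WHEN COUNT(*) > 1 THEN 1 ELSE 0 END AS HasDuplicates
    FROM CompanyTable
    GROUP BY EmployeeID, DivisionId
)
SELECT ct.EmployeeID, ct.DivisionId, cts.HasDuplicates
FROM CompanyTable ct
INNER JOIN cteCompanyTableStats cts
    ON ct.EmployeeID = cts.EmployeeID
   AND ct.DivisionId = cts.DivisionId
ORDER BY cts.HasDuplicates DESC, ct.EmployeeID, ct.DivisionId;
```

With the sample data, the two abc-div1 rows come back flagged 1 and the remaining three rows flagged 0.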