T-SQL MERGE won't handle NULLs as expected
If this query:
SELECT CONCAT(SOURCE.OrderNo, '_', SOURCE.OrderLine),
SOURCE.ProdOrder,
SOURCE.Lvl1,
SOURCE.Lvl2,
SOURCE.Lvl3,
SOURCE.LastDate
FROM dbo.SourceTbl AS SOURCE
returns 11 records, and this query:
SELECT CONCAT(TARGET.OrderNo, '_', TARGET.OrderLine),
TARGET.ProdOrder,
TARGET.Lvl1,
TARGET.Lvl2,
TARGET.Lvl3,
TARGET.LastDate
FROM dbo.TargetTbl AS TARGET
returns 17 records, and the intersection of the two:
SELECT CONCAT(SOURCE.OrderNo, '_', SOURCE.OrderLine),
SOURCE.ProdOrder,
SOURCE.Lvl1,
SOURCE.Lvl2,
SOURCE.Lvl3,
SOURCE.LastDate
FROM dbo.SourceTbl AS SOURCE
INTERSECT
SELECT CONCAT(TARGET.OrderNo, '_', TARGET.OrderLine),
TARGET.ProdOrder,
TARGET.Lvl1,
TARGET.Lvl2,
TARGET.Lvl3,
TARGET.LastDate
FROM dbo.TargetTbl AS TARGET
returns 9 records, then when I run a MERGE like this:
MERGE dbo.TargetTbl AS TARGET
USING (
SELECT OrderNo, OrderLine, CONCAT(OrderNo, '_', OrderLine) AS OrderNoLine, SomeModel, ProdOrder, Lvl1, Lvl2, Lvl3,
MAX(LastDate) AS LastDate
FROM dbo.SourceTbl
GROUP BY OrderNo, OrderLine, CONCAT(OrderNo, '_', OrderLine), SomeModel, ProdOrder, Lvl1, Lvl2, Lvl3
) AS SOURCE
ON CONCAT(TARGET.OrderNo, '_', TARGET.OrderLine) = OrderNoLine
AND TARGET.ProdOrder = SOURCE.ProdOrder
AND TARGET.Lvl1 = SOURCE.Lvl1
AND TARGET.Lvl2 = SOURCE.Lvl2
AND TARGET.Lvl3 = SOURCE.Lvl3
AND TARGET.LastDate = SOURCE.LastDate
WHEN MATCHED AND EXISTS (SELECT CONCAT(SOURCE.OrderNo, '_', SOURCE.OrderLine)
,SOURCE.ProdOrder
,SOURCE.Lvl1
,SOURCE.Lvl2
,SOURCE.Lvl3
,SOURCE.LastDate
INTERSECT
SELECT CONCAT(TARGET.OrderNo, '_', TARGET.OrderLine)
,TARGET.ProdOrder
,TARGET.Lvl1
,TARGET.Lvl2
,TARGET.Lvl3
,TARGET.LastDate
)
THEN UPDATE SET TARGET.IsBlocked = 1, TARGET.BlockDate = GETDATE()
WHEN NOT MATCHED BY TARGET
THEN INSERT (LastDate, UsrID, DepID, OrderNo, OrderLine, SomeModel, ProdOrder, Lvl1, Lvl2, Lvl3, IsBlocked, BlockDate)
VALUES (SOURCE.LastDate, 999, 999, SOURCE.OrderNo, SOURCE.OrderLine, SOURCE.SomeModel, SOURCE.ProdOrder, SOURCE.Lvl1, SOURCE.Lvl2, SOURCE.Lvl3, 1, GETDATE());
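According to this and this, it should update the 9 INTERSECT records in TargetTbl and insert the remaining 2 records from SourceTbl into the same table (11 in total). Instead, it updates 4 records and inserts 6 (10 in total). Two records in SourceTbl are duplicates, which is why the total is 10 rather than 11, and also why I use MAX and GROUP BY.
I think the problem is in the first part of the query, the USING part, which fails to handle NULLs correctly even though the INTERSECT part does its job. I have tried everything I could think of, without success. I'm sure this is easy to do, so please help me. Thanks.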
EDIT: the SourceTbl data, with irrelevant columns omitted, as returned by SELECT OrderNo, OrderLine, CONCAT(OrderNo, '_', OrderLine) AS OrderNoLine, SomeModel, ProdOrder, Lvl1, Lvl2, Lvl3, LastDate AS LastDate FROM dbo.SourceTbl ORDER BY OrderNo, OrderLine, SomeModel, ProdOrder:
OrderNo OrderLine OrderNoLine SomeModel ProdOrder Lvl1 Lvl2 Lvl3 LastDate
123c08637 10 123c08637_10 4321525175_004321 A5C008837 Abcd Efgh Olol 04/03/2030
123c11214 10 123c11214_10 4321532622_000391 NULL NULL NULL NULL 07/07/2018
123c13039 10 123c13039_10 4321525175_002611 A5C014838 NULL NULL NULL 18/05/2018
123c16059 10 123c16059_10 4321541488_001111 A5C018611 NULL NULL NULL 18/05/2018
123c17482 10 123c17482_10 4321506480_001711 A5C019227 Asdf Ghjk Cvnm 12/12/2018
123c17482 10 123c17482_10 4321506480_001711 A5C047712 Asdf Ghjk Cvnm 12/12/2018
123c17482 20 123c17482_20 4321506480_001712 A5B072554 aaaa bbbb cccc 18/05/2018
123c17482 20 123c17482_20 4321506480_001712 A5B072554 aaaa bbbb cccc 18/05/2018
123c17482 20 123c17482_20 4321506480_001712 A5B072554 aaaa bbbb xxxx 18/05/2018
123c17482 20 123c17482_20 4321506480_001712 A5B200472 NULL NULL NULL 18/05/2018
123c32405 10 123c32405_10 8765525667_005301 NULL Qwer Uiop Tygh 12/12/2018
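GROUP BY might reduce the number of records to just one (if the 11 records differ only in LastDate and SomeModel holds the same value for all 11), or it might return all 11 records (if SomeModel holds unique values), so GROUP BY does not necessarily return 10 distinct rows. For that, use SELECT DISTINCT rather than grouping by a subset of the columns.
Also, if the ON condition worked the way you expect, the additional EXISTS condition would be redundant. Apparently 4 matches were found and 6 records had no match. Of those 6, presumably 2 really have no matching record, and 4 fail to match because of NULL values: a comparison such as TARGET.Lvl1 = SOURCE.Lvl1 never evaluates to true when either side is NULL.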
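One way to check that guess is to list the rows that INTERSECT treats as common to both tables but that contain a NULL in one of the compared columns, since those are the rows a plain "=" in the ON clause cannot match. This is only a sketch built on the table and column names from the question; the alias m is made up for illustration, and it compares OrderNo and OrderLine separately instead of through CONCAT.

```sql
-- Rows present in both tables according to INTERSECT (NULLs compare as equal there)
-- that contain at least one NULL, so an ON clause built from "=" will not match them.
SELECT m.OrderNo, m.OrderLine, m.ProdOrder, m.Lvl1, m.Lvl2, m.Lvl3, m.LastDate
FROM (
    SELECT OrderNo, OrderLine, ProdOrder, Lvl1, Lvl2, Lvl3, LastDate
    FROM dbo.SourceTbl
    INTERSECT
    SELECT OrderNo, OrderLine, ProdOrder, Lvl1, Lvl2, Lvl3, LastDate
    FROM dbo.TargetTbl
) AS m
WHERE m.OrderNo   IS NULL
   OR m.OrderLine IS NULL
   OR m.ProdOrder IS NULL
   OR m.Lvl1      IS NULL
   OR m.Lvl2      IS NULL
   OR m.Lvl3      IS NULL
   OR m.LastDate  IS NULL;
```

If the number of rows returned accounts for the unexpected inserts, the NULL comparisons in the ON clause are the likely cause.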
To handle the NULL values, I suggest changing the whole statement to something like this:
MERGE dbo.TargetTbl AS TARGET
USING (
-- SomeModel is selected too, because the INSERT below reads it from SOURCE.
-- If SomeModel can differ between rows with the same match key, deduplicate further.
SELECT DISTINCT OrderNo, OrderLine, SomeModel, ProdOrder, Lvl1, Lvl2, Lvl3, LastDate
FROM dbo.SourceTbl
) AS SOURCE
-- Each column matches when the values are equal or when both sides are NULL
ON (TARGET.OrderNo = SOURCE.OrderNo OR (TARGET.OrderNo IS NULL AND SOURCE.OrderNo IS NULL))
AND (TARGET.OrderLine = SOURCE.OrderLine OR (TARGET.OrderLine IS NULL AND SOURCE.OrderLine IS NULL))
AND (TARGET.ProdOrder = SOURCE.ProdOrder OR (TARGET.ProdOrder IS NULL AND SOURCE.ProdOrder IS NULL))
AND (TARGET.Lvl1 = SOURCE.Lvl1 OR (TARGET.Lvl1 IS NULL AND SOURCE.Lvl1 IS NULL))
AND (TARGET.Lvl2 = SOURCE.Lvl2 OR (TARGET.Lvl2 IS NULL AND SOURCE.Lvl2 IS NULL))
AND (TARGET.Lvl3 = SOURCE.Lvl3 OR (TARGET.Lvl3 IS NULL AND SOURCE.Lvl3 IS NULL))
AND (TARGET.LastDate = SOURCE.LastDate OR (TARGET.LastDate IS NULL AND SOURCE.LastDate IS NULL))
WHEN MATCHED
THEN UPDATE SET TARGET.IsBlocked = 1, TARGET.BlockDate = GETDATE()
WHEN NOT MATCHED BY TARGET
THEN INSERT (LastDate, UsrID, DepID, OrderNo, OrderLine, SomeModel, ProdOrder, Lvl1, Lvl2, Lvl3, IsBlocked, BlockDate)
VALUES (SOURCE.LastDate, 999, 999, SOURCE.OrderNo, SOURCE.OrderLine, SOURCE.SomeModel, SOURCE.ProdOrder, SOURCE.Lvl1, SOURCE.Lvl2, SOURCE.Lvl3, 1, GETDATE());
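The per-column "equal or both NULL" test can also be written with the EXISTS (... INTERSECT ...) comparison that the question already uses in its WHEN MATCHED clause. The sketch below shows the difference on a plain join rather than on the MERGE itself; the table variables @Src and @Tgt and their sample rows are invented for the demonstration, and whether this form suits the real MERGE (and how it performs) would need testing.

```sql
-- Hypothetical two-column stand-ins for SourceTbl and TargetTbl
DECLARE @Src TABLE (ProdOrder varchar(20) NULL, Lvl1 varchar(10) NULL);
DECLARE @Tgt TABLE (ProdOrder varchar(20) NULL, Lvl1 varchar(10) NULL);

INSERT INTO @Src VALUES ('A5C008837', 'Abcd'), ('A5C014838', NULL), (NULL, 'Qwer');
INSERT INTO @Tgt VALUES ('A5C008837', 'Abcd'), ('A5C014838', NULL), (NULL, 'Qwer');

-- Plain equality: only the row without any NULL finds its partner.
SELECT COUNT(*) AS EqualityMatches            -- 1
FROM @Src AS s
JOIN @Tgt AS t
  ON t.ProdOrder = s.ProdOrder
 AND t.Lvl1 = s.Lvl1;

-- INTERSECT compares the two single-row sets with NULLs treated as equal,
-- so every source row finds its identical target row.
SELECT COUNT(*) AS NullSafeMatches            -- 3
FROM @Src AS s
JOIN @Tgt AS t
  ON EXISTS (SELECT s.ProdOrder, s.Lvl1
             INTERSECT
             SELECT t.ProdOrder, t.Lvl1);
```

The INTERSECT form stays short as more columns are added, while the explicit IS NULL checks spell out exactly what is being compared.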
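Some features of the SQL language work with the concept of distinctness (notably DISTINCT and GROUP BY), under which NULL IS NOT DISTINCT FROM NULL is true. The same applies to UNION, EXCEPT, INTERSECT and so on.
Unfortunately, SQL Server versions before SQL Server 2022 do not implement the standard SQL IS [NOT] DISTINCT FROM operator, so there you are limited to equality comparisons, where, famously in SQL, NULL = NULL is unknown (neither true nor false). You therefore have to perform the NULL checks explicitly in the ON clause (or use IS NOT DISTINCT FROM on a version of SQL Server that supports it).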
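A small illustration of the two behaviours, as a sketch only: the table variable @v is invented for the example, and the results in the comments assume the default SET ANSI_NULLS ON.

```sql
DECLARE @v TABLE (x int NULL);
INSERT INTO @v VALUES (NULL), (NULL);

-- Distinctness: DISTINCT collapses the two NULLs into a single row.
SELECT COUNT(*) AS DistinctRows               -- 1
FROM (SELECT DISTINCT x FROM @v) AS d;

-- Equality: NULL = NULL is unknown, so no pair of rows joins.
SELECT COUNT(*) AS EqualityJoinRows           -- 0
FROM @v AS a
JOIN @v AS b
  ON a.x = b.x;

-- Explicit NULL check: each of the 2 rows pairs with both rows.
SELECT COUNT(*) AS NullSafeJoinRows           -- 4
FROM @v AS a
JOIN @v AS b
  ON a.x = b.x OR (a.x IS NULL AND b.x IS NULL);

-- On SQL Server 2022 or later the last predicate can be written as:
--   ON a.x IS NOT DISTINCT FROM b.x
```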