Evaluate multiple conditions for the same row
I have to compare two different sources and find all the mismatches for every ID.
Source_excel table
+-----+-------------+------+----------+
| id | name | City | flag |
+-----+-------------+------+----------+
| 101 | Plate | NY | Ready |
| 102 | Back washer | NY | Sold |
| 103 | Ring | MC | Planning |
| 104 | Glass | NMC | Ready |
| 107 | Cover | PR | Ready |
+-----+-------------+------+----------+
Source_dw table
+-----+----------+------+----------+
| id | name | City | flag |
+-----+----------+------+----------+
| 101 | Plate | NY | Planning |
| 102 | Nut | TN | Expired |
| 103 | Ring | MC | Planning |
| 104 | Top Wire | NY | Ready |
| 105 | Bolt | MC | Expired |
+-----+----------+------+----------+
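For anyone who wants to reproduce this, a setup script along these lines should load the sample data. The column types and lengths are assumptions; only the values come from the two tables above.

```sql
-- Column types and lengths are assumed; the post only shows the data.
CREATE TABLE source_excel (
    id   INT PRIMARY KEY,
    name VARCHAR(50),
    city VARCHAR(10),
    flag VARCHAR(20)
);

CREATE TABLE source_dw (
    id   INT PRIMARY KEY,
    name VARCHAR(50),
    city VARCHAR(10),
    flag VARCHAR(20)
);

-- Rows copied from the Source_excel table
INSERT INTO source_excel (id, name, city, flag) VALUES
    (101, 'Plate',       'NY',  'Ready'),
    (102, 'Back washer', 'NY',  'Sold'),
    (103, 'Ring',        'MC',  'Planning'),
    (104, 'Glass',       'NMC', 'Ready'),
    (107, 'Cover',       'PR',  'Ready');

-- Rows copied from the Source_dw table
INSERT INTO source_dw (id, name, city, flag) VALUES
    (101, 'Plate',    'NY', 'Planning'),
    (102, 'Nut',      'TN', 'Expired'),
    (103, 'Ring',     'MC', 'Planning'),
    (104, 'Top Wire', 'NY', 'Ready'),
    (105, 'Bolt',     'MC', 'Expired');
```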
Expected result
+-----+-------------+----------+------------+----------+------------+---------+------------------+
| ID | excel_name | dw_name | excel_flag | dw_flag | excel_city | dw_city | RESULT |
+-----+-------------+----------+------------+----------+------------+---------+------------------+
| 101 | Plate | Plate | Ready | Planning | NY | NY | FLAG_MISMATCH |
| 102 | Back washer | Nut | Sold | Expired | NY | TN | NAME_MISMATCH |
| 102 | Back washer | Nut | Sold | Expired | NY | TN | FLAG_MISMATCH |
| 102 | Back washer | Nut | Sold | Expired | NY | TN | CITY_MISMATCH |
| 103 | Ring | Ring | Planning | Planning | MC | MC | ALL_MATCH |
| 104 | Glass | Top Wire | Ready | Ready | NMC | NY | NAME_MISMATCH |
| 104 | Glass | Top Wire | Ready | Ready | NMC | NY | CITY_MISMATCH |
| 107 | Cover | | Ready | | PR | | MISSING IN DW |
| 105 | | Bolt | | Expired | | MC | MISSING IN EXCEL |
+-----+-------------+----------+------------+----------+------------+---------+------------------+
I tried the query below, but it reports only one mismatch per ID.
select ISNULL(EXCEL.ID,DW.ID) ID,
excel.name as excel_name,dw.name as dw_name,
excel.flag as excel_flag,dw.flag as dw_flag,
excel.city as excel_city,dw.city as dw_city,
RESULT = CASE WHEN excel.ID IS NULL THEN 'MISSING IN EXCEL'
WHEN dw.ID IS NULL THEN 'MISSING IN DW'
WHEN excel.NAME<>dw.NAME THEN 'NAME_MISMATCH'
WHEN excel.CITY<>dw.CITY THEN 'CITY_MISMATCH'
WHEN excel.FLAG <> dw.FLAG THEN 'FLAG_MISMATCH'
ELSE 'ALL_MATCH' END
from source_excel excel
FULL OUTER JOIN source_dw dw ON excel.id=dw.id
Actual output
+-----+-------------+----------+------------+----------+------------+---------+------------------+
| ID | excel_name | dw_name | excel_flag | dw_flag | excel_city | dw_city | RESULT |
+-----+-------------+----------+------------+----------+------------+---------+------------------+
| 101 | Plate | Plate | Ready | Planning | NY | NY | FLAG_MISMATCH |
| 102 | Back washer | Nut | Sold | Expired | NY | TN | NAME_MISMATCH |
| 103 | Ring | Ring | Planning | Planning | MC | MC | ALL_MATCH |
| 104 | Glass | Top Wire | Ready | Ready | NMC | NY | NAME_MISMATCH |
| 107 | Cover | | Ready | | PR | | MISSING IN DW |
| 105 | | Bolt | | Expired | | MC | MISSING IN EXCEL |
+-----+-------------+----------+------------+----------+------------+---------+------------------+
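I understand that a CASE expression returns the result of only the first condition that is satisfied. Is there another way to check all of the conditions?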
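If I understand correctly, you want one row per mismatch, or a single row saying that everything matches.
You can use CROSS APPLY to generate the rows, like this: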
SELECT
COALESCE(xl.ID, dw.ID) ID,
xl.name as excel_name,dw.name as dw_name,
xl.flag as excel_flag,dw.flag as dw_flag,
xl.city as excel_city,dw.city as dw_city,
x.result
FROM source_excel xl
FULL OUTER JOIN source_dw dw ON xl.id = dw.id
CROSS APPLY (VALUES
(CASE WHEN xl.ID IS NULL THEN 'MISSING IN EXCEL' END),
(CASE WHEN dw.ID IS NULL THEN 'MISSING IN DW' END),
(CASE WHEN xl.NAME <> dw.NAME THEN 'NAME_MISMATCH' END),
(CASE WHEN xl.CITY <> dw.CITY THEN 'CITY_MISMATCH' END),
(CASE WHEN xl.FLAG <> dw.FLAG THEN 'FLAG_MISMATCH' END),
(CASE WHEN
xl.ID = dw.ID
AND xl.NAME = dw.NAME
AND xl.CITY = dw.CITY
AND xl.FLAG = dw.FLAG
THEN 'ALL_MATCH' END)
) x(result)
WHERE x.result IS NOT NULL
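One caveat: both `<>` and `=` evaluate to UNKNOWN when either side is NULL. If an ID exists in both sources but, say, the name is NULL in only one of them, that row yields neither a mismatch nor ALL_MATCH and drops out of the result. The sample data has no such NULLs, so this only matters if the real data does. Here is a sketch of a NULL-safe variant, assuming the same table and column names; the `d` alias and the `*_diff` columns are names I made up for illustration. It uses the `EXISTS (SELECT ... EXCEPT SELECT ...)` pattern, which treats two NULLs as equal:

```sql
SELECT
    COALESCE(xl.id, dw.id) AS ID,
    xl.name AS excel_name, dw.name AS dw_name,
    xl.flag AS excel_flag, dw.flag AS dw_flag,
    xl.city AS excel_city, dw.city AS dw_city,
    x.result
FROM source_excel xl
FULL OUTER JOIN source_dw dw ON xl.id = dw.id
-- Each *_diff is 1 when the two values differ; NULL vs. non-NULL counts as a difference
CROSS APPLY (SELECT
    CASE WHEN EXISTS (SELECT xl.name EXCEPT SELECT dw.name) THEN 1 ELSE 0 END AS name_diff,
    CASE WHEN EXISTS (SELECT xl.city EXCEPT SELECT dw.city) THEN 1 ELSE 0 END AS city_diff,
    CASE WHEN EXISTS (SELECT xl.flag EXCEPT SELECT dw.flag) THEN 1 ELSE 0 END AS flag_diff
) d
CROSS APPLY (VALUES
    (CASE WHEN xl.id IS NULL THEN 'MISSING IN EXCEL' END),
    (CASE WHEN dw.id IS NULL THEN 'MISSING IN DW' END),
    -- xl.id = dw.id keeps rows that are missing on one side from also being reported as mismatches
    (CASE WHEN xl.id = dw.id AND d.name_diff = 1 THEN 'NAME_MISMATCH' END),
    (CASE WHEN xl.id = dw.id AND d.city_diff = 1 THEN 'CITY_MISMATCH' END),
    (CASE WHEN xl.id = dw.id AND d.flag_diff = 1 THEN 'FLAG_MISMATCH' END),
    (CASE WHEN xl.id = dw.id
           AND d.name_diff + d.city_diff + d.flag_diff = 0
          THEN 'ALL_MATCH' END)
) x(result)
WHERE x.result IS NOT NULL;
```

On SQL Server 2022 the `EXISTS ... EXCEPT` checks could be written more directly as `xl.name IS DISTINCT FROM dw.name`.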
I would combine the mismatches into a single row per ID and concatenate the reasons:
select COALESCE(EXCEL.ID, DW.ID) as ID,
excel.name as excel_name,dw.name as dw_name,
excel.flag as excel_flag,dw.flag as dw_flag,
excel.city as excel_city,dw.city as dw_city,
(CASE WHEN excel.ID IS NULL
THEN 'MISSING IN EXCEL'
WHEN dw.ID IS NULL
THEN 'MISSING IN DW'
WHEN excel.NAME = dw.NAME AND excel.CITY = dw.CITY AND excel.FLAG = dw.FLAG
THEN 'ALL_MATCH'
ELSE CONCAT(CASE WHEN excel.NAME <> dw.NAME THEN 'NAME_MISMATCH; ' END,
CASE WHEN excel.CITY <> dw.CITY THEN 'CITY_MISMATCH; ' END,
CASE WHEN excel.FLAG <> dw.FLAG THEN 'FLAG_MISMATCH;' END
)
END) AS RESULT
from source_excel excel FULL OUTER JOIN
source_dw dw
ON excel.id = dw.id;
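If you are on SQL Server 2017 or later, `CONCAT_WS` might be a slightly tidier choice than `CONCAT` here. It skips NULL arguments, so the separator only appears between the reasons that actually apply and there is no trailing `; `. A sketch, assuming the same tables as in the question:

```sql
SELECT COALESCE(excel.id, dw.id) AS ID,
       excel.name AS excel_name, dw.name AS dw_name,
       excel.flag AS excel_flag, dw.flag AS dw_flag,
       excel.city AS excel_city, dw.city AS dw_city,
       CASE WHEN excel.id IS NULL THEN 'MISSING IN EXCEL'
            WHEN dw.id IS NULL THEN 'MISSING IN DW'
            WHEN excel.name = dw.name AND excel.city = dw.city AND excel.flag = dw.flag
                THEN 'ALL_MATCH'
            -- CONCAT_WS ignores the NULLs produced by the CASE expressions that do not fire
            ELSE CONCAT_WS('; ',
                     CASE WHEN excel.name <> dw.name THEN 'NAME_MISMATCH' END,
                     CASE WHEN excel.city <> dw.city THEN 'CITY_MISMATCH' END,
                     CASE WHEN excel.flag <> dw.flag THEN 'FLAG_MISMATCH' END)
       END AS RESULT
FROM source_excel excel
FULL OUTER JOIN source_dw dw ON excel.id = dw.id;
```

With the sample data, ID 102 should come back as a single row whose RESULT lists all three mismatches separated by `; `.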