Duplicates on joining 2 tables
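I have a SELECT statement that returns a lot of data (more than 23K rows) and joins more than 5 tables. When I look at the tables, I can see that one of the joins I am doing matches 2 rows.
I tried GROUP BY and I also tried SELECT DISTINCT, but neither worked. What should I do?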
SELECT DISTINCT FirstName, LastName, F.Interview FROM tbData D
LEFT JOIN tbInterview F ON D.UserID = F.UserID
WHERE CreatedDate BETWEEN '' AND ''
This returns duplicates because tbInterview has multiple interviews linked to the same UserID. I then tried this:
SELECT DISTINCT FirstName, LastName, F.Interview FROM tbData D
LEFT JOIN (SELECT UserID FROM tbInterview GROUP BY UserID) AS InterviewID ON D.UserID = InterviewID.UserID
LEFT JOIN tbInterview F ON InterviewID.UserID = F.UserID
WHERE CreatedDate BETWEEN '' AND ''
This did not work either.
Here is a sample of the data in table tbInterview:
╔═════════════╤════════╤═════════════╤═════════════════════════════╗
║ InterViewID │ UserID │ DateCreated │ Interview ║
╠═════════════╪════════╪═════════════╪═════════════════════════════╣
║ 1 │ 120 │ 2015/05/10 │ Inter View Done ║
╟─────────────┼────────┼─────────────┼─────────────────────────────╢
║ 2 │ 120 │ 2015/05/15 │ 2nd Interview was requested ║
╚═════════════╧════════╧═════════════╧═════════════════════════════╝
Now, when I run the SELECT with the join to tbInterview, the output looks like this:
╔═══════════╤══════════╤═════════════════════════════╗
║ FirstName │ LastName │ Interview ║
╠═══════════╪══════════╪═════════════════════════════╣
║ James │ Smith │ Inter View Done ║
╟───────────┼──────────┼─────────────────────────────╢
║ James │ Smith │ 2nd Interview was requested ║
╚═══════════╧══════════╧═════════════════════════════╝
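Assuming you want the row in tbInterview with the latest DateCreated, you have several possible solutions. Here is one, based on your first query, that uses the ROW_NUMBER() function.
This assumes you really only need information for rows whose CreatedDate value (assumed to be on the tbData table, since the equivalent column on tbInterview is named DateCreated) falls within the range specified in your example query.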
SELECT
    FirstName,
    LastName,
    F.Interview
FROM tbData D
LEFT JOIN (
    SELECT
        UserID,
        Interview,
        DateCreated,
        -- Number each user's interviews, newest first
        ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY DateCreated DESC) AS RowNumber
    FROM tbInterview
    WHERE DateCreated > '' -- Earliest date created for range on tbData
) AS F
    ON D.UserID = F.UserID
    AND F.RowNumber = 1 -- kept in ON (not WHERE) so users without interviews still appear
WHERE D.CreatedDate BETWEEN '' AND ''
This associates the latest interview row (based on DateCreated) with each tbData user.
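To try the ROW_NUMBER() approach end to end, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3 module. Several things here are assumptions for illustration only: the column types, the tbData row for James Smith (including its CreatedDate), the made-up second user with no interviews, and the date range in the WHERE clause. Dates are stored as ISO-formatted text so that they sort correctly as strings. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical in-memory schema mirroring the tables in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbData (UserID INTEGER, FirstName TEXT, LastName TEXT, CreatedDate TEXT);
CREATE TABLE tbInterview (InterViewID INTEGER, UserID INTEGER, DateCreated TEXT, Interview TEXT);

-- CreatedDate values are invented for this sketch.
INSERT INTO tbData VALUES (120, 'James', 'Smith', '2015-05-01');
-- Made-up user with no interviews, to show that the LEFT JOIN keeps such rows.
INSERT INTO tbData VALUES (121, 'Anna', 'Jones', '2015-05-02');

-- Sample rows from the question.
INSERT INTO tbInterview VALUES (1, 120, '2015-05-10', 'Inter View Done');
INSERT INTO tbInterview VALUES (2, 120, '2015-05-15', '2nd Interview was requested');
""")

rows = conn.execute("""
SELECT D.FirstName, D.LastName, F.Interview
FROM tbData D
LEFT JOIN (
    SELECT UserID,
           Interview,
           -- Number each user's interviews, newest first
           ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY DateCreated DESC) AS RowNumber
    FROM tbInterview
) AS F
    ON D.UserID = F.UserID
    AND F.RowNumber = 1
WHERE D.CreatedDate BETWEEN '2015-01-01' AND '2015-12-31'
ORDER BY D.UserID
""").fetchall()

for row in rows:
    print(row)
```

With this data, James Smith comes back once, paired with his most recent interview, and the user without interviews comes back with a NULL Interview. Filtering on RowNumber = 1 in the WHERE clause instead of the ON clause would drop that second user, because the NULL row number fails the comparison.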