How to select the TOP 1 record per group (Partition)
I have a table named tblRoutes that contains a list of unique routes (f = from; t = to):
| fCity | fState | tCity | tState |
|========|========|========|========|
|New York| NY | Miami | CA |
|Houston | TX |New York| NY |
...
Then I have a table named tblCarrierRates that lists the tiers and rates carriers offer for travelling certain routes:
| fCity | fState | tCity | tState | Tier | Rate | CarrID | CarrName |
|========|========|========|========|======|======|========|=============|
|New York| NY | Miami | CA | 2 | .99| ABCD | Abracadabra |
|New York| NY | Miami | CA | 1 | .00| BUMP | Bumpy Rides |
|Houston | TX |New York| NY | 2 | .00| SLOW |Slow Carriers|
|Houston | TX |New York| NY | 2 | .01| ABCD | Abracadabra |
...
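For each unique route listed in tblRoutes, I am looking for the 1 "best" offer in tblCarrierRates.
The criterion for "the best" is the lowest Tier, followed by the lowest Rate.
The result needs to return all fields shown in tblCarrierRates, so based on the 2 routes shown in tblRoutes, the desired result is: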
| fCity | fState | tCity | tState | Tier | Rate | CarrID | CarrName |
|========|========|========|========|======|======|========|=============|
|New York| NY | Miami | CA | 1 | .00| BUMP | Bumpy Rides |
|Houston | TX |New York| NY | 2 | .00| SLOW |Slow Carriers|
The way I see it, I need to sort by Tier ascending, then by Rate ascending, and then somehow match the TOP 1 record for each unique combination of fCity, fState, tCity and tState:
SELECT A.fCity, A.fState, A.tCity, A.tState, Q.Tier, Q.Rate, Q.CarrID, Q.CarrName
FROM tblRoutes As A LEFT JOIN
(SELECT TOP 1 B.CarrID, B.CarrName, B.fCity, B.fState, B.tCity, B.tState, B.Rate, B.Tier
FROM tblCarrierRates As B
ORDER BY tblCarrierRates.Tier ASC, tblCarrierRates.Rate ASC) As Q
ON (A.tState = Q.tState) AND (A.tCity = Q.tCity) AND (A.fState = Q.fState) AND (A.fCity = Q.fCity);
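The query does not fail, but as you may have guessed, the subquery I wrote (Q) returns only one record in total rather than 1 record per route in tblRoutes, so the final result is: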
| fCity | fState | tCity | tState | Tier | Rate | CarrID | CarrName |
|========|========|========|========|======|======|========|=============|
|New York| NY | Miami | CA | 1 | .00| BUMP | Bumpy Rides |
|Houston | TX |New York| NY | | | | |
...As you can see, Houston to New York has no match, because my subquery returned only 1 result rather than 1 result per route.
How can I get the result I want?
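Your inner query needs to be grouped by city and state. That produces 1 row per city/state combination, which the outer join can then match on those fields.
Debug your inner query on its own until it returns the result the outer query needs. Take out the TOP 1 first so you can see that the sorting and grouping work. I would also put ASC or DESC explicitly in your inner query so that others know which direction you expect TOP to work in.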
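To make that debugging step concrete, here is a minimal sketch that runs the inner query on its own. It assumes Python's built-in sqlite3 module as a stand-in for MS Access (so `LIMIT 1` plays the role of `TOP 1`) and loads only the four sample rows from the question; the shortened column list is just for readability.

```python
import sqlite3

# Sample rows copied from the question (Rate values exactly as posted).
RATES = [
    ("New York", "NY", "Miami", "CA", 2, 0.99, "ABCD", "Abracadabra"),
    ("New York", "NY", "Miami", "CA", 1, 0.00, "BUMP", "Bumpy Rides"),
    ("Houston", "TX", "New York", "NY", 2, 0.00, "SLOW", "Slow Carriers"),
    ("Houston", "TX", "New York", "NY", 2, 0.01, "ABCD", "Abracadabra"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblCarrierRates (fCity TEXT, fState TEXT, tCity TEXT, "
    "tState TEXT, Tier INTEGER, Rate REAL, CarrID TEXT, CarrName TEXT)"
)
conn.executemany("INSERT INTO tblCarrierRates VALUES (?,?,?,?,?,?,?,?)", RATES)

INNER = (
    "SELECT fCity, tCity, Tier, Rate, CarrID FROM tblCarrierRates "
    "ORDER BY Tier ASC, Rate ASC"
)

# Step 1: no TOP/LIMIT yet, so the sort order can be inspected.
ordered = conn.execute(INNER).fetchall()

# Step 2: LIMIT 1 (SQLite's counterpart of TOP 1) applies to the whole
# sorted table, not to each route.
top1 = conn.execute(INNER + " LIMIT 1").fetchall()

for row in ordered:
    print(row)
print("top1:", top1)
```

`ordered` contains all four rows with the Bumpy Rides offer first, while `top1` contains that single row only, which is why just one route finds a match in the outer join.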
You can try the following query:
SELECT fCity, fState, tCity, tState, MIN(Tier), MIN(Rate), CarrID, CarrName
FROM tblCarrierRates
GROUP BY fCity, fState, tCity, tState, CarrID, CarrName;
I believe you are looking for the equivalent of the SQL Server and Oracle analytic/window functions, such as ROW_NUMBER() OVER (PARTITION BY .. ORDER BY ..).
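For comparison, this is roughly what the window-function form looks like on an engine that supports it. The sketch below is an assumption-laden illustration, not Access code: it uses Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer), loads the four sample rows from the question, and leaves out the join to tblRoutes to keep it short.

```python
import sqlite3

# Sample rows copied from the question (Rate values exactly as posted).
RATES = [
    ("New York", "NY", "Miami", "CA", 2, 0.99, "ABCD", "Abracadabra"),
    ("New York", "NY", "Miami", "CA", 1, 0.00, "BUMP", "Bumpy Rides"),
    ("Houston", "TX", "New York", "NY", 2, 0.00, "SLOW", "Slow Carriers"),
    ("Houston", "TX", "New York", "NY", 2, 0.01, "ABCD", "Abracadabra"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblCarrierRates (fCity TEXT, fState TEXT, tCity TEXT, "
    "tState TEXT, Tier INTEGER, Rate REAL, CarrID TEXT, CarrName TEXT)"
)
conn.executemany("INSERT INTO tblCarrierRates VALUES (?,?,?,?,?,?,?,?)", RATES)

# Number the rows inside each route (the partition), best offer first,
# then keep only row number 1 of every partition.
QUERY = """
SELECT fCity, fState, tCity, tState, Tier, Rate, CarrID, CarrName
FROM (
    SELECT r.*,
           ROW_NUMBER() OVER (
               PARTITION BY fCity, fState, tCity, tState
               ORDER BY Tier ASC, Rate ASC
           ) AS rn
    FROM tblCarrierRates AS r
) AS ranked
WHERE rn = 1
ORDER BY fCity
"""

best = conn.execute(QUERY).fetchall()
for row in best:
    print(row)
```

With the sample data, `best` holds one row per route: Slow Carriers for Houston to New York and Bumpy Rides for New York to Miami.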
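Although this is not directly available in MS Access, I believe the row numbering can be emulated there with a correlated subquery. For each row, the subquery counts the rows in the same "partition" (defined by the join filter) that come before it under the sort criteria, and the row's rank is that count plus one: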
SELECT A.fCity, A.fState, A.tCity, A.tState, Q.Tier, Q.Rate, Q.CarrID, Q.CarrName, TheRank
FROM tblRoutes As A LEFT JOIN
(
SELECT B.CarrID, B.CarrName, B.fCity, B.fState, B.tCity, B.tState, B.Rate, B.Tier,
(
SELECT COUNT(*) + 1
FROM tblCarrierRates rnk
-- Partition Simulation (JOIN)
WHERE B.fCity = rnk.fCity AND B.fState = rnk.fState
AND B.tCity = rnk.tCity AND B.tState = rnk.tState
-- ORDER BY Simulation
AND (rnk.Tier < B.Tier OR
(rnk.Tier = B.Tier AND rnk.Rate < B.Rate))) AS TheRank
FROM tblCarrierRates As B) As Q
ON (A.tState = Q.tState) AND (A.tCity = Q.tCity)
AND (A.fState = Q.fState) AND (A.fCity = Q.fCity)
-- Now, you just want the top rank in each partition.
WHERE TheRank = 1;
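Mind the performance: the subquery is executed for every row.
Also, if there are ties, both rows will be returned.
The +1 makes the row numbering start at 1 in each partition (the first row of a partition has zero rows before it).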
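The tie behaviour is easy to check with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for MS Access, ranks tblCarrierRates only (no join to tblRoutes), and adds one hypothetical row, carrier `TIED`, that is not in the question and exists only to force a tie. The extra `CarrID` comparison is one possible tie-breaker, not something the original query contains.

```python
import sqlite3

# Sample rows from the question plus one HYPOTHETICAL row ("TIED") that
# matches Slow Carriers on Tier and Rate, purely to demonstrate ties.
RATES = [
    ("New York", "NY", "Miami", "CA", 2, 0.99, "ABCD", "Abracadabra"),
    ("New York", "NY", "Miami", "CA", 1, 0.00, "BUMP", "Bumpy Rides"),
    ("Houston", "TX", "New York", "NY", 2, 0.00, "SLOW", "Slow Carriers"),
    ("Houston", "TX", "New York", "NY", 2, 0.01, "ABCD", "Abracadabra"),
    ("Houston", "TX", "New York", "NY", 2, 0.00, "TIED", "Tied Carrier"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblCarrierRates (fCity TEXT, fState TEXT, tCity TEXT, "
    "tState TEXT, Tier INTEGER, Rate REAL, CarrID TEXT, CarrName TEXT)"
)
conn.executemany("INSERT INTO tblCarrierRates VALUES (?,?,?,?,?,?,?,?)", RATES)

RANK_QUERY = """
SELECT Q.fCity, Q.tCity, Q.CarrID
FROM (
    SELECT B.*,
           (SELECT COUNT(*) + 1
            FROM tblCarrierRates AS rnk
            WHERE rnk.fCity = B.fCity AND rnk.fState = B.fState
              AND rnk.tCity = B.tCity AND rnk.tState = B.tState
              AND (rnk.Tier < B.Tier
                   OR (rnk.Tier = B.Tier AND rnk.Rate < B.Rate)
                   {tie_breaker})) AS TheRank
    FROM tblCarrierRates AS B
) AS Q
WHERE Q.TheRank = 1
ORDER BY Q.fCity, Q.CarrID
"""

# Assumed tie-breaker: among equal Tier and Rate, the lower CarrID wins.
TIE_BREAKER = (
    "OR (rnk.Tier = B.Tier AND rnk.Rate = B.Rate AND rnk.CarrID < B.CarrID)"
)

with_ties = conn.execute(RANK_QUERY.format(tie_breaker="")).fetchall()
without_ties = conn.execute(RANK_QUERY.format(tie_breaker=TIE_BREAKER)).fetchall()

print("rank = 1, as written:   ", with_ties)
print("rank = 1, tie-breaker:  ", without_ties)
```

As written, both tied Houston rows come back with rank 1; with the extra comparison on a column that is unique within the partition, each route is left with exactly one row.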
Edit: the same query with the comments removed:
SELECT A.fCity, A.fState, A.tCity, A.tState, Q.Tier, Q.Rate, Q.CarrID, Q.CarrName, TheRank
FROM tblRoutes As A LEFT JOIN
(
SELECT B.CarrID, B.CarrName, B.fCity, B.fState, B.tCity, B.tState, B.Rate, B.Tier,
(
SELECT COUNT(*) + 1
FROM tblCarrierRates rnk
WHERE B.fCity = rnk.fCity AND B.fState = rnk.fState
AND B.tCity = rnk.tCity AND B.tState = rnk.tState
AND (rnk.Tier < B.Tier OR
(rnk.Tier = B.Tier AND rnk.Rate < B.Rate))) AS TheRank
FROM tblCarrierRates As B) As Q
ON (A.tState = Q.tState) AND (A.tCity = Q.tCity)
AND (A.fState = Q.fState) AND (A.fCity = Q.fCity)
WHERE TheRank = 1