Rewriting WITH statements into subquery statements in SQL?
I have the following two relations:
Game(id, name, year)
Devs(pid, gid, role)
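where Game.id is the primary key and Devs.gid is a foreign key referencing Game.id.
In an earlier question, another user kindly helped me create a query that finds all games made by the most developers. His answer used a WITH statement, which I am not very familiar with, since I have only been learning SQL for a few weeks. This is the working query: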
WITH GamesDevs (GameName, DevsCount)
AS
(
SELECT Game.name AS GameName, count(DISTINCT Devs.pid) AS DevsCount
FROM Game, Devs
WHERE Devs.gid=Game.id
GROUP BY Devs.gid, Game.name
)
SELECT * FROM GamesDevs WHERE GamesDevs.DevsCount = (SELECT MAX(DevsCount) FROM GamesDevs)
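To get more familiar with SQL, I am trying to rewrite this query using a subquery instead of the WITH statement. I have been using this Oracle documentation to help me figure it out. I tried rewriting the query like this: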
SELECT *
FROM (SELECT Game.name AS GameName, count(DISTINCT Devs.pid) AS DevsCount
FROM Game, Devs
WHERE Devs.gid=Game.id
GROUP BY Devs.gid, Game.name) GamesDevs
WHERE GamesDevs.DevsCount = (SELECT MAX(DevsCount) FROM GamesDevs)
As far as I can tell, these two queries should be identical. However, when I try to run the second query, I get the error
Msg 207, Level 16, State 1, Line 6 Invalid column name 'DevsCount'.
Does anyone know why I am getting this error, or why these two queries are not equivalent?
The problem is in this line:
WHERE GamesDevs.DevsCount = (SELECT MAX(DevsCount) FROM GamesDevs)
From the CTE you can select twice, but you cannot do that from a subquery.
The first part
WHERE GamesDevs.DevsCount
is correct. But
(SELECT MAX(DevsCount) FROM GamesDevs)
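is not correct, because you cannot reuse a subquery. The alias GamesDevs can qualify column names in the outer query, but it cannot be used as a table source in another FROM clause. Selecting from a CTE works like selecting from a view, which is why with the CTE you can both compute the maximum and compare the counts.
You need to repeat that subquery in the final FROM clause, for example: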
SELECT *
FROM (SELECT Game.name AS GameName, count(DISTINCT Devs.pid) AS DevsCount
FROM Game, Devs
WHERE Devs.gid=Game.id
GROUP BY Devs.gid, Game.name) GamesDevs
WHERE GamesDevs.DevsCount = (SELECT MAX(DevsCount) FROM (SELECT Game.name AS GameName, count(DISTINCT Devs.pid) AS DevsCount
FROM Game
INNER JOIN Devs ON Devs.gid=Game.id
GROUP BY Devs.gid, Game.name) AS AllCounts) -- SQL Server requires an alias on every derived table
But it is better to do it like this:
SELECT TOP 1 WITH TIES Game.name AS GameName, count(DISTINCT Devs.pid) AS DevsCount
FROM Game
INNER JOIN Devs ON Devs.gid=Game.id
GROUP BY Devs.gid, Game.name
ORDER BY DevsCount DESC
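To try the repeated-subquery form against a small data set, a minimal sketch like the following may help. The column types, the sample rows, and the alias AllCounts are assumptions made up for illustration; they do not come from the question.

```sql
-- Hypothetical schema: column types are assumed, not taken from the question.
CREATE TABLE Game (id INT PRIMARY KEY, name VARCHAR(50), [year] INT);
CREATE TABLE Devs (pid INT, gid INT REFERENCES Game(id), role VARCHAR(50));

-- Hypothetical sample rows. Developer 10 has two roles on game 1,
-- so COUNT(DISTINCT Devs.pid) must count that developer only once.
INSERT INTO Game (id, name, [year]) VALUES
    (1, 'Alpha', 2001), (2, 'Beta', 2002), (3, 'Gamma', 2003);
INSERT INTO Devs (pid, gid, role) VALUES
    (10, 1, 'design'), (10, 1, 'code'), (11, 1, 'code'),
    (10, 2, 'code'),   (11, 2, 'art'),
    (12, 3, 'code');

-- Subquery-only rewrite of the CTE query: the derived table is written out
-- twice, and each copy carries its own alias.
SELECT GamesDevs.GameName, GamesDevs.DevsCount
FROM (SELECT Game.name AS GameName, COUNT(DISTINCT Devs.pid) AS DevsCount
      FROM Game
      INNER JOIN Devs ON Devs.gid = Game.id
      GROUP BY Devs.gid, Game.name) AS GamesDevs
WHERE GamesDevs.DevsCount =
      (SELECT MAX(AllCounts.DevsCount)
       FROM (SELECT COUNT(DISTINCT Devs.pid) AS DevsCount
             FROM Devs
             GROUP BY Devs.gid) AS AllCounts);
```

With these rows, Alpha and Beta each have two distinct developers and Gamma has one, so the query should return Alpha and Beta, each with a DevsCount of 2. The inner copy only needs the counts, so in this sketch it groups Devs by gid without joining Game; that shortcut relies on every Devs.gid matching a Game.id, which the foreign key guarantees.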
You can use the RANK window function in a subquery to find the games with the highest number of distinct developers:
SELECT GameName, DevsCount
FROM (
SELECT Game.name AS GameName,
COUNT(DISTINCT Devs.pid) AS DevsCount,
RANK() OVER (ORDER BY COUNT(DISTINCT Devs.pid) DESC) AS rnk
FROM Game
INNER JOIN Devs ON Game.id = Devs.gid
GROUP BY Devs.gid, Game.name ) t
WHERE t.rnk = 1
This way you do not have to repeat the SELECT MAX(DevsCount) query. You just select the records that have rnk = 1.
You should also consider using ANSI-standard join syntax, which states the exact type of JOIN operation you are using.
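@Giorgi's answer, which suggests SELECT TOP 1 WITH TIES, is a cleaner solution to your original problem. I am just posting this as an alternative way to correct the subquery you are using.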