Why does SQL not allow the WITH clause for multiple subqueries?

Why does SQL only allow nested subqueries?

For example, take this problem.

The table is named ratings, with these columns:

  1. user_id
  2. occupation
  3. rating

In Postgres or BigQuery, I would write:

with ratings_by_user as (
select occupation, user_id, count(*) num_ratings
from ratings
group by 1,2
),

max_ratings_by_occupation as (
select occupation, max(num_ratings) as max_ratings
from ratings_by_user
group by 1
)

select occupation, user_id
from ratings_by_user
inner join max_ratings_by_occupation
using (occupation)
where num_ratings = max_ratings

But I'm not sure how to do this in SQL, where I need to nest all the subqueries in one block. Here is my attempt in SQL, but it doesn't work.

select occupation, user_id, count(*) as num_ratings
from ( 
    select occupation, max(num_ratings) max_ratings 
    from ( 
        select occupation, user_id, count(*) num_ratings
        from users
        group by 1,2
        ) as ratings_table
    group by 1
    ) as max_ratings_table
)
inner join ratings on ratings.occupation = max_ratings_table.occupation
where max_ratings = num_ratings

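Can anyone tell me how to write this in SQL in the same style as Postgres / BigQuery, where I process my subqueries one after another? I just find it hard to solve a complex problem in one go.

Thank you very much for your time.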

Per https://dev.mysql.com/doc/refman/8.0/en/with.html and MySQL "WITH" clause, WITH is only supported on MySQL 8+. Make sure you are using an appropriate version of MySQL.

Converting to a nested version isn't hard. Let's take your working SQL:

with ratings_by_user as (
select occupation, user_id, count(*) num_ratings
from ratings
group by 1,2
),

max_ratings_by_occupation as (
select occupation, max(num_ratings) as max_ratings
from ratings_by_user
group by 1
)

select occupation, user_id
from ratings_by_user
inner join max_ratings_by_occupation
using (occupation)
where num_ratings = max_ratings

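We copy everything, including the parentheses from the WITH, and paste it just before each place the alias is used.

Step 1: cut ratings_by_user and paste it everywhere ratings_by_user is used (twice):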



--cut from here
with ratings_by_user as ,

max_ratings_by_occupation as (
select occupation, max(num_ratings) as max_ratings
from 
  --paste to here
  (
    select occupation, user_id, count(*) num_ratings
    from ratings
    group by 1,2
  ) ratings_by_user
group by 1
)

select occupation, user_id
from
--and also paste to here 
(
  select occupation, user_id, count(*) num_ratings
  from ratings
  group by 1,2
) ratings_by_user
inner join max_ratings_by_occupation
using (occupation)
where num_ratings = max_ratings

Step 2: cut max_ratings_by_occupation and paste it where it is used:

with ratings_by_user as ,

--cut from here
max_ratings_by_occupation as ,

select occupation, user_id
from
(
  select occupation, user_id, count(*) num_ratings
  from ratings
  group by 1,2
) ratings_by_user
inner join 

--paste to here
(
  select occupation, max(num_ratings) as max_ratings
  from 
  (
    select occupation, user_id, count(*) num_ratings
    from ratings
    group by 1,2
  ) ratings_by_user
  group by 1
) max_ratings_by_occupation

using (occupation)
where num_ratings = max_ratings

Step 3: clean up the empty WITHs:


select occupation, user_id
from
(
  select occupation, user_id, count(*) num_ratings
  from ratings
  group by 1,2
) ratings_by_user
inner join 
(
  select occupation, max(num_ratings) as max_ratings
  from 
  (
    select occupation, user_id, count(*) num_ratings
    from ratings
    group by 1,2
  ) ratings_by_user
  group by 1
) max_ratings_by_occupation

using (occupation)
where num_ratings = max_ratings

This would be the starting point for optimizing/rewriting. The tricky part is that the query uses ratings_by_user twice, so step 1 needs two pastes.
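If you want some reassurance that the nested form returns the same rows as the WITH form, you can run both on a few made-up rows. The sketch below does that in Python with the standard-library sqlite3 module as a stand-in engine. That is an assumption on my part: SQLite is not MySQL, but it accepts both query shapes. The sample rows are hypothetical, and the WITH version here has no comma after the last CTE so that it parses.

```python
import sqlite3

# Hypothetical sample rows: (user_id, occupation, rating)
ROWS = [
    (1, "engineer", 5),
    (1, "engineer", 4),
    (1, "engineer", 3),
    (2, "engineer", 2),
    (3, "artist", 5),
    (4, "artist", 4),
    (4, "artist", 1),
]

# The WITH form from the question (no comma after the last CTE).
CTE_SQL = """
with ratings_by_user as (
  select occupation, user_id, count(*) num_ratings
  from ratings
  group by 1,2
),
max_ratings_by_occupation as (
  select occupation, max(num_ratings) as max_ratings
  from ratings_by_user
  group by 1
)
select occupation, user_id
from ratings_by_user
inner join max_ratings_by_occupation
using (occupation)
where num_ratings = max_ratings
"""

# The nested form produced by step 3.
NESTED_SQL = """
select occupation, user_id
from
(
  select occupation, user_id, count(*) num_ratings
  from ratings
  group by 1,2
) ratings_by_user
inner join
(
  select occupation, max(num_ratings) as max_ratings
  from
  (
    select occupation, user_id, count(*) num_ratings
    from ratings
    group by 1,2
  ) ratings_by_user
  group by 1
) max_ratings_by_occupation
using (occupation)
where num_ratings = max_ratings
"""


def run(sql):
    """Load the sample rows into a fresh in-memory database and run one query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table ratings (user_id integer, occupation text, rating integer)"
    )
    conn.executemany("insert into ratings values (?, ?, ?)", ROWS)
    result = sorted(conn.execute(sql).fetchall())
    conn.close()
    return result


cte_result = run(CTE_SQL)
nested_result = run(NESTED_SQL)
print(cte_result == nested_result, nested_result)
```

With these sample rows, both queries return one row per occupation: the user with the most ratings in that occupation.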

Your reformatting attempt didn't work because you tried to use, at the outer level, a result set that only exists at the inner level:

select occupation, user_id, count(*) as num_ratings
from 

( --max_ratings_table available inside these brackets
    select occupation, max(num_ratings) max_ratings 
    from ( 
        select occupation, user_id, count(*) num_ratings
        from users
        group by 1,2
        ) as ratings_table
    group by 1
    ) as max_ratings_table
  --end of max_ratings_table availability
) 

inner join ratings on ratings.occupation = max_ratings_table.occupation
--                                         ^^^^^^^^^^^^^^^^^
--                            max_ratings_table not available here
where max_ratings = num_ratings
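To make the scoping point concrete, here is a small sketch, again using Python's standard-library sqlite3 as a stand-in engine (an assumption; MySQL's error text will differ). The first query refers to an inner alias from the outer level and is rejected. The second is one possible repair of the attempt: each derived table gets its alias at the level where the join needs it. The aliases inner_t and outer_t and the sample rows are made up for illustration, and I am assuming the `users` in the attempt was meant to be `ratings`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table ratings (user_id integer, occupation text, rating integer)")
conn.executemany(
    "insert into ratings values (?, ?, ?)",
    # Hypothetical sample rows: (user_id, occupation, rating)
    [(1, "engineer", 5), (1, "engineer", 4), (2, "engineer", 3), (3, "artist", 2)],
)

# inner_t is defined inside the outer parentheses, so the join at the outer
# level can only see outer_t, not inner_t.
OUT_OF_SCOPE = """
select outer_t.occupation
from (
    select occupation from (
        select occupation from ratings group by 1
    ) as inner_t
) as outer_t
inner join ratings on ratings.occupation = inner_t.occupation
"""

try:
    conn.execute(OUT_OF_SCOPE)
    scope_error = None
except sqlite3.OperationalError as exc:
    # The engine cannot resolve inner_t at the outer level.
    scope_error = str(exc)

# One possible repair: both derived tables sit side by side at the outer
# level, each with its own alias, so the join condition can see both.
FIXED = """
select ratings_table.occupation, ratings_table.user_id
from (
    select occupation, user_id, count(*) num_ratings
    from ratings
    group by 1,2
) as ratings_table
inner join (
    select occupation, max(num_ratings) max_ratings
    from (
        select occupation, user_id, count(*) num_ratings
        from ratings
        group by 1,2
    ) as ratings_table
    group by 1
) as max_ratings_table
on ratings_table.occupation = max_ratings_table.occupation
where max_ratings = num_ratings
"""

fixed_rows = sorted(conn.execute(FIXED).fetchall())
print(scope_error)
print(fixed_rows)
```

The repaired query has the same shape as the step 3 result above, just with an explicit ON condition instead of USING.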