SQL for bucketing counts

I'm trying to use the StackExchange Data Explorer to build a histogram of the reputation of people who ask questions on the site.

The error message is:

Each GROUP BY expression must contain at least one column that 
is not an outer reference.
Invalid column name 'lt_100'. ...

Any suggestions are appreciated.

select
  case when Reputation < 100    then "lt_100"
       when Reputation >= 100 and Reputation < 200   then "100_199"
       when Reputation >= 200 and Reputation < 300   then "200_299"
       when Reputation >= 300 and Reputation < 400   then "300_399"
       when Reputation >= 400 and Reputation < 500   then "400_499"
       when Reputation >= 500 and Reputation < 600   then "500_599"
       when Reputation >= 600 and Reputation < 700   then "600_699"
       when Reputation >= 700 and Reputation < 800   then "700_799"
       when Reputation >= 800 and Reputation < 900   then "800_899"
       when Reputation >= 900 and Reputation < 1000  then "900_999"
       else "over 1000"
  end  ReputationRange,
  count(*) as TotalWithinRange
FROM Users
JOIN Posts ON Users.Id = Posts.OwnerUserId 
JOIN PostTags ON PostTags.PostId = Posts.Id
JOIN Tags on Tags.Id = PostTags.TagId
WHERE PostTypeId = 1 and Posts.CreationDate > '9/1/2010'
Group by 
1

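You should use single quotes for the range labels. Anything in double quotes (" ") is treated as a column name, which is where the "Invalid column name 'lt_100'" error comes from. You should also repeat the case expression in the group by clause.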

Demo

select
  case when Reputation < 100    then 'lt_100'
       when Reputation >= 100 and Reputation < 200   then '100_199'
       when Reputation >= 200 and Reputation < 300   then '200_299'
       when Reputation >= 300 and Reputation < 400   then '300_399'
       when Reputation >= 400 and Reputation < 500   then '400_499'
       when Reputation >= 500 and Reputation < 600   then '500_599'
       when Reputation >= 600 and Reputation < 700   then '600_699'
       when Reputation >= 700 and Reputation < 800   then '700_799'
       when Reputation >= 800 and Reputation < 900   then '800_899'
       when Reputation >= 900 and Reputation < 1000  then '900_999'
       else 'over 1000'
  end ReputationRange,
  count(*) as TotalWithinRange
FROM Users
JOIN Posts ON Users.Id = Posts.OwnerUserId 
JOIN PostTags ON PostTags.PostId = Posts.Id
JOIN Tags on Tags.Id = PostTags.TagId
WHERE PostTypeId = 1 and Posts.CreationDate > '9/1/2010'
Group by 
case when Reputation < 100    then 'lt_100'
       when Reputation >= 100 and Reputation < 200   then '100_199'
       when Reputation >= 200 and Reputation < 300   then '200_299'
       when Reputation >= 300 and Reputation < 400   then '300_399'
       when Reputation >= 400 and Reputation < 500   then '400_499'
       when Reputation >= 500 and Reputation < 600   then '500_599'
       when Reputation >= 600 and Reputation < 700   then '600_699'
       when Reputation >= 700 and Reputation < 800   then '700_799'
       when Reputation >= 800 and Reputation < 900   then '800_899'
       when Reputation >= 900 and Reputation < 1000  then '900_999'
       else 'over 1000'
end

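Unfortunately, you can't use "1" in group by as a positional alias the way you can in order by.

To avoid repeating the case expression in your group by, you can take advantage of the 'with' clause in SQL:
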
with data as (
select
  case when Reputation < 100    then 'lt_100'
       when Reputation >= 100 and Reputation < 200   then '100_199'
       when Reputation >= 200 and Reputation < 300   then '200_299'
       when Reputation >= 300 and Reputation < 400   then '300_399'
       when Reputation >= 400 and Reputation < 500   then '400_499'
       when Reputation >= 500 and Reputation < 600   then '500_599'
       when Reputation >= 600 and Reputation < 700   then '600_699'
       when Reputation >= 700 and Reputation < 800   then '700_799'
       when Reputation >= 800 and Reputation < 900   then '800_899'
       when Reputation >= 900 and Reputation < 1000  then '900_999'
       else 'over 1000'
       end as ReputationRange
FROM Users
JOIN Posts ON Users.Id = Posts.OwnerUserId 
JOIN PostTags ON PostTags.PostId = Posts.Id
JOIN Tags on Tags.Id = PostTags.TagId
WHERE PostTypeId = 1 and Posts.CreationDate > '9/1/2010')
select  ReputationRange, count(*) as TotalWithinRange
from data
Group by ReputationRange

Working Demo/Example
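
One more variation, offered only as a sketch and not taken from the answers above: because the buckets are all 100 wide, the long case expression can be replaced with integer division. This assumes Reputation is an integer column (so that / 100 truncates) and that a numeric bucket start is acceptable in place of the text labels. The alias b and the column name BucketStart are made up for illustration. The joins and filter are copied unchanged from the question so the counts cover the same rows.

```sql
-- Sketch: bucket by integer division instead of a long CASE.
-- BucketStart is 0 for 0-99, 100 for 100-199, ..., 900 for 900-999,
-- and 1000 for everything at or above 1000.
SELECT
    b.BucketStart,
    COUNT(*) AS TotalWithinRange
FROM Users
JOIN Posts ON Users.Id = Posts.OwnerUserId
JOIN PostTags ON PostTags.PostId = Posts.Id
JOIN Tags ON Tags.Id = PostTags.TagId
CROSS APPLY (
    -- Compute the bucket once per row so GROUP BY can refer to it by name.
    SELECT CASE WHEN Users.Reputation >= 1000 THEN 1000
                ELSE (Users.Reputation / 100) * 100   -- integer division (assumes int column)
           END AS BucketStart
) AS b
WHERE Posts.PostTypeId = 1 AND Posts.CreationDate > '9/1/2010'
GROUP BY b.BucketStart
ORDER BY b.BucketStart;
```

Like the 'with' version, this names the computed value once, so the group by stays short. It also sorts numerically, which the text labels would not.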