SQL for bucketing counts
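I'm trying to use the StackExchange Data Explorer to build a histogram of the reputation of people who ask questions on the site.
The query below fails with these error messages: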
Each GROUP BY expression must contain at least one column that
is not an outer reference.
Invalid column name 'lt_100'. ...
Any suggestions are appreciated.
select
case when Reputation < 100 then "lt_100"
when Reputation >= 100 and Reputation < 200 then "100_199"
when Reputation >= 200 and Reputation < 300 then "200_299"
when Reputation >= 300 and Reputation < 400 then "300_399"
when Reputation >= 400 and Reputation < 500 then "400_499"
when Reputation >= 500 and Reputation < 600 then "500_599"
when Reputation >= 600 and Reputation < 700 then "600_699"
when Reputation >= 700 and Reputation < 800 then "700_799"
when Reputation >= 800 and Reputation < 900 then "800_899"
when Reputation >= 900 and Reputation < 1000 then "900_999"
else "over 1000"
end ReputationRange,
count(*) as TotalWithinRange
FROM Users
JOIN Posts ON Users.Id = Posts.OwnerUserId
JOIN PostTags ON PostTags.PostId = Posts.Id
JOIN Tags on Tags.Id = PostTags.TagId
WHERE PostTypeId = 1 and Posts.CreationDate > '9/1/2010'
Group by
1
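You should use single quotes for the range labels. If you use double quotes, the label is treated as a column name. You should also repeat the calculation in the GROUP BY clause.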
select
case when Reputation < 100 then 'lt_100'
when Reputation >= 100 and Reputation < 200 then '100_199'
when Reputation >= 200 and Reputation < 300 then '200_299'
when Reputation >= 300 and Reputation < 400 then '300_399'
when Reputation >= 400 and Reputation < 500 then '400_499'
when Reputation >= 500 and Reputation < 600 then '500_599'
when Reputation >= 600 and Reputation < 700 then '600_699'
when Reputation >= 700 and Reputation < 800 then '700_799'
when Reputation >= 800 and Reputation < 900 then '800_899'
when Reputation >= 900 and Reputation < 1000 then '900_999'
else 'over 1000'
end ReputationRange,
count(*) as TotalWithinRange
FROM Users
JOIN Posts ON Users.Id = Posts.OwnerUserId
JOIN PostTags ON PostTags.PostId = Posts.Id
JOIN Tags on Tags.Id = PostTags.TagId
WHERE PostTypeId = 1 and Posts.CreationDate > '9/1/2010'
Group by
case when Reputation < 100 then 'lt_100'
when Reputation >= 100 and Reputation < 200 then '100_199'
when Reputation >= 200 and Reputation < 300 then '200_299'
when Reputation >= 300 and Reputation < 400 then '300_399'
when Reputation >= 400 and Reputation < 500 then '400_499'
when Reputation >= 500 and Reputation < 600 then '500_599'
when Reputation >= 600 and Reputation < 700 then '600_699'
when Reputation >= 700 and Reputation < 800 then '700_799'
when Reputation >= 800 and Reputation < 900 then '800_899'
when Reputation >= 900 and Reputation < 1000 then '900_999'
else 'over 1000'
end
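The quoting rule can be seen in isolation with two one-line statements. This is a minimal sketch that assumes SQL Server's default QUOTED_IDENTIFIER ON setting; the alias Label is only for illustration.

```sql
-- Single quotes delimit a string literal, so this returns the text lt_100.
select 'lt_100' as Label;

-- Double quotes delimit an identifier (assuming QUOTED_IDENTIFIER is ON),
-- so this looks for a column called lt_100 and fails with:
-- Invalid column name 'lt_100'.
select "lt_100" as Label;
```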
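Unfortunately, you can't use 1 in GROUP BY to refer to the first column the way you can in ORDER BY.
To avoid repeating the CASE expression in the GROUP BY, you can use SQL's WITH clause: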
with data as (
select
case when Reputation < 100 then 'lt_100'
when Reputation >= 100 and Reputation < 200 then '100_199'
when Reputation >= 200 and Reputation < 300 then '200_299'
when Reputation >= 300 and Reputation < 400 then '300_399'
when Reputation >= 400 and Reputation < 500 then '400_499'
when Reputation >= 500 and Reputation < 600 then '500_599'
when Reputation >= 600 and Reputation < 700 then '600_699'
when Reputation >= 700 and Reputation < 800 then '700_799'
when Reputation >= 800 and Reputation < 900 then '800_899'
when Reputation >= 900 and Reputation < 1000 then '900_999'
else 'over 1000'
end as ReputationRange FROM Users
JOIN Posts ON Users.Id = Posts.OwnerUserId
JOIN PostTags ON PostTags.PostId = Posts.Id
JOIN Tags on Tags.Id = PostTags.TagId
WHERE PostTypeId = 1 and Posts.CreationDate > '9/1/2010')
select ReputationRange, count(*) as TotalWithinRange
from data
Group by ReputationRange
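Because the ranges are all 100 wide, the bucket can also be computed arithmetically instead of being listed case by case. The following is only a sketch of that idea, not part of the answers above. It assumes Reputation is a non-negative integer column, so that dividing by 100 truncates, and the name BucketStart is made up for illustration. It keeps the same joins and filter, so the counts should match the queries above; only the labels change (0, 100, ..., 900, and 1000 for everything from 1000 up).

```sql
-- Sketch: derive the bucket from integer division rather than one WHEN per range.
-- Assumes Reputation is a non-negative integer column.
with data as (
    select
        case when Reputation >= 1000 then 1000   -- 1000 and above share one bucket
             else (Reputation / 100) * 100       -- integer division gives 0, 100, ..., 900
        end as BucketStart                       -- hypothetical column name
    FROM Users
    JOIN Posts ON Users.Id = Posts.OwnerUserId
    JOIN PostTags ON PostTags.PostId = Posts.Id
    JOIN Tags on Tags.Id = PostTags.TagId
    WHERE PostTypeId = 1 and Posts.CreationDate > '9/1/2010')
select BucketStart, count(*) as TotalWithinRange
from data
Group by BucketStart
order by BucketStart
```

Grouping on a numeric bucket also lets ORDER BY sort the histogram in numeric order, which the text labels do not ('100_199' sorts before 'lt_100').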