SQL group by different levels on the same dataset
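I have the following dataset and want to build groupings at several levels to count how often each value of name occurs.
Have (county is stored as a string):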
name state county
apple MD 1
apple DC 1
pear VA 1
pear VA 2
pear CA 5
peach CO 3
peach CO 3
peach CO 2
peach CO 2
Want:
name state county freq_name freq_state freq_county
apple MD 1 2 1 2
apple DC 1 2 1 2
pear VA 1 3 2 3
pear VA 2 3 2 3
pear CA 5 3 1 3
peach CO 3 4 4 2
peach CO 2 4 4 2
I believe SQL partitioning (window functions) allows counting at different levels, something like:
count(name) over (partition by name) as freq_name,
count(name) over (partition by state) as freq_state,
count(name) as freq_county
from have
group by name,state, county;
For some reason this code does not give me the correct count for freq_name. I would also like to check whether my code for freq_state and freq_county is accurate. Thanks!
For freq_name, use count(*) instead of count(name):
count(*) over (partition by name) as freq_name,
count(name) over (partition by state) as freq_state,
count(name) as freq_county
from have
group by name,state, county;
You seem to want:
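select name, state, county, count(*) as this_count,
       sum(count(*)) over (partition by name) as freq_name,
       sum(count(*)) over (partition by state) as freq_state,
       -- assumption: a nested aggregate needs an OVER clause, so freq_county is
       -- read as the number of (state, county) groups per name, which matches
       -- the desired output
       count(*) over (partition by name) as freq_county
from have
group by name, state, county;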
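One way to check this kind of query against the sample data is to run it in an in-memory database. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module (the question does not say which database is in use, and window functions need SQLite 3.25 or later), the function name frequency_table is made up for the example, and freq_county is interpreted as the number of distinct (state, county) groups per name, because that reading reproduces the desired table.

```python
import sqlite3

# Sample rows from the question (county kept as a string, as stated there).
ROWS = [
    ("apple", "MD", "1"),
    ("apple", "DC", "1"),
    ("pear", "VA", "1"),
    ("pear", "VA", "2"),
    ("pear", "CA", "5"),
    ("peach", "CO", "3"),
    ("peach", "CO", "3"),
    ("peach", "CO", "2"),
    ("peach", "CO", "2"),
]

# Window functions are evaluated after GROUP BY, so they see one row per
# (name, state, county) group:
#   sum(count(*)) over (...)  adds up the row counts of the groups in the partition
#   count(*)      over (...)  counts the groups themselves
QUERY = """
select name, state, county,
       sum(count(*)) over (partition by name)  as freq_name,
       sum(count(*)) over (partition by state) as freq_state,
       count(*)      over (partition by name)  as freq_county  -- assumed meaning
from have
group by name, state, county
order by name, state, county
"""


def frequency_table():
    """Load the sample rows into an in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table have (name text, state text, county text)")
    conn.executemany("insert into have values (?, ?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in frequency_table():
        print(row)
```

The distinction that matters here is that, once the query has a GROUP BY, a plain count(*) over (partition by name) counts groups rather than original rows. For peach that is 2 groups but 4 rows, which is why freq_name needs sum(count(*)) over (partition by name) instead.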