SQL group by different levels on the same dataset

Question

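I have the following dataset and would like to create different groupings to count how many times the values under name occur.

Have (county is stored as a string):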

name   state  county 
apple   MD      1
apple   DC      1
pear    VA      1
pear    VA      2
pear    CA      5
peach   CO      3
peach   CO      3
peach   CO      2
peach   CO      2

Want:

name   state  county freq_name  freq_state  freq_county
apple   MD      1     2            1            2
apple   DC      1     2            1            2
pear    VA      1     3            2            3
pear    VA      2     3            2            3
pear    CA      5     3            1            3
peach   CO      3     4            4            2
peach   CO      2     4            4            2

I believe that in SQL, partitioning would allow counting at different levels, something like:

count(name) over (partition by name) as freq_name,
count(name) over (partition by state) as freq_state,
count(name) as freq_county
from have
group by name,state, county;

For some reason, this code does not give me the correct count for freq_name. I would also like to check whether my freq_state and freq_county code is accurate. Thanks!

Answer 1

For freq_name, use count(*) instead of count(name):

count(*) over (partition by name) as freq_name,
count(name) over (partition by state) as freq_state,
count(name) as freq_county
from have
group by name,state, county;
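
One way to check this suggestion against the sample data is to run it in an in-memory database. The sketch below is only an illustration: it assumes SQLite 3.25 or later (through Python's sqlite3 module) as a stand-in for the asker's engine, and it adds the select list that the snippet leaves out. Window functions in a grouped query are evaluated over the grouped rows, so count(*) over (partition by name) counts (name, state, county) groups rather than original rows. On this data that gives 2 for peach, not the 4 shown in the desired output.

```python
import sqlite3

# Sample rows from the question; county is kept as a string.
rows = [
    ("apple", "MD", "1"), ("apple", "DC", "1"),
    ("pear", "VA", "1"), ("pear", "VA", "2"), ("pear", "CA", "5"),
    ("peach", "CO", "3"), ("peach", "CO", "3"),
    ("peach", "CO", "2"), ("peach", "CO", "2"),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table have (name text, state text, county text)")
conn.executemany("insert into have values (?, ?, ?)", rows)

# The window function runs after GROUP BY, so it sees one row per group.
query = """
select name, state, county,
       count(*) over (partition by name) as freq_name
from have
group by name, state, county
"""
group_counts = {name: freq for name, _, _, freq in conn.execute(query)}
print(group_counts)
```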

Answer 2

You seem to want:

select name, state, county, count(*) as this_count,
       sum(count(*)) over (partition by name) as freq_name,
       sum(count(*)) over (partition by state) as freq_state,
       count(*) as freq_county  -- one row per (name, state, county) already; sum() of an aggregate needs an over clause
from have
group by name, state, county;
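
To try this query on the sample data, here is a minimal sketch that loads the rows into an in-memory SQLite database from Python. It rests on two assumptions that are not from the thread: SQLite 3.25 or later stands in for the asker's engine, and the county-level column uses a plain count(*), since an aggregate nested inside sum() without an over clause is generally rejected. On this data freq_name and freq_state come out as in the desired table. freq_county is simply the number of rows in each (name, state, county) group.

```python
import sqlite3

# Sample rows from the question; county is kept as a string.
rows = [
    ("apple", "MD", "1"), ("apple", "DC", "1"),
    ("pear", "VA", "1"), ("pear", "VA", "2"), ("pear", "CA", "5"),
    ("peach", "CO", "3"), ("peach", "CO", "3"),
    ("peach", "CO", "2"), ("peach", "CO", "2"),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table have (name text, state text, county text)")
conn.executemany("insert into have values (?, ?, ?)", rows)

# sum(count(*)) over (...) adds the per-group row counts back up
# across every group that shares the partition key.
query = """
select name, state, county,
       sum(count(*)) over (partition by name)  as freq_name,
       sum(count(*)) over (partition by state) as freq_state,
       count(*)                                as freq_county
from have
group by name, state, county
order by name, state, county
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Summing the grouped counts inside the window is what recovers totals over the original rows, while a bare count(*) inside the window would only count groups.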


Tags: sql, sql-server, hadoop, hive, cloudera