How to group data in one column and distribute it in another column in Hive SQL?
I have the following data:
| CompanyID | Department | No of People | Country |
|---|---|---|---|
| 45390 | HR | 100 | UK |
| 45390 | Service | 250 | UK |
| 98712 | Service | 300 | US |
| 39284 | Admin | 142 | Norway |
| 85932 | Admin | 260 | Germany |
I want to know how many people are in each department, broken down by country.
Desired output:
| Department | No of People | Country |
|---|---|---|
| HR | 100 | UK |
| Service | 250 | UK |
| | 300 | US |
| Admin | 142 | Norway |
| | 260 | Germany |
I am able to get the data, but my query repeats the department on every row:
""" select Department, Country,count(Department) from dataset
group by Country,Department
order by Department """
How can I get the desired output?
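The result set that you are producing is not really a relational result set. Why? Because each row depends on what is in the "previous" row, and in a relational database there is no such thing as a "previous" row. This type of processing is often handled at the application layer.
Of course, SQL can do what you want. You just need to be careful: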
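    -- Show Department only on the first row of each department (ordered by Country);
    -- the remaining rows of that department get NULL.
    select (case when 1 = row_number() over (partition by Department order by Country)
                 then Department
            end) as Department,
           Country,
           count(*) as num_people
    from dataset
    group by Country, Department
    order by Department, Country;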
Note that the order by needs to match the window function clause, so that what row_number() considers to be the first row really is the first row in the result set.
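
For comparison, the application-layer handling mentioned in this answer could look roughly like the Python sketch below. It is only an illustration under stated assumptions: the `rows` list is hypothetical sample data standing in for whatever the grouped query returns, and `blank_repeated_departments` is a made-up helper name, not part of Hive or any library.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows, as (department, country, num_people) tuples,
# standing in for the output of the grouped query.
rows = [
    ("Service", "US", 300),
    ("HR", "UK", 100),
    ("Admin", "Norway", 142),
    ("Service", "UK", 250),
    ("Admin", "Germany", 260),
]


def blank_repeated_departments(rows):
    """Sort rows by (department, country) and keep the department name
    only on the first row of each department; later rows get ""."""
    ordered = sorted(rows, key=itemgetter(0, 1))
    result = []
    # groupby relies on the rows already being sorted by department.
    for department, group in groupby(ordered, key=itemgetter(0)):
        for i, (_, country, num_people) in enumerate(group):
            shown = department if i == 0 else ""
            result.append((shown, country, num_people))
    return result


for row in blank_repeated_departments(rows):
    print(row)
```

Doing the blanking after the query keeps the SQL result relational (every row carries its own department), and the display ordering is decided in exactly one place.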