Impala: change the column type before performing the aggregation function for GROUP BY
I have a table, my_table:
transaction_id | money | team
--------------------------------------------
1 | 10 | A
2 | 20 | B
3 | null | A
4 | 30 | A
5 | 16 | B
6 | 12 | B
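When I group by team, I can compute the maximum and minimum with this query:
select team, max(money), min(money) from my_table group by team
However, I can't compute avg and sum because of the null. That is:
select team, avg(money), sum(money) from my_table group by team
fails.
Is there a way to change the column type before computing the average and sum? I would like the output to be: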
team | avg(money) | sum(money)
--------------------------------------
A | 20 | 40
B | 16 | 48
Thanks!
According to the documentation provided by Cloudera, your query should work as is. Both the AVG function and the SUM function ignore NULL values.
SELECT team, AVG(money), SUM(money)
FROM my_table
GROUP BY team
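To check the NULL-ignoring behaviour without touching the real table, here is a minimal sketch that rebuilds the sample rows inline. The CTE name sample and the column aliases are made up for illustration, and it assumes money is a genuine numeric column holding a real NULL rather than the text 'null'.

```sql
-- Hypothetical inline copy of the question's data; money is numeric here.
WITH sample AS (
  SELECT 1 AS transaction_id, 10 AS money, 'A' AS team UNION ALL
  SELECT 2, 20, 'B' UNION ALL
  SELECT 3, CAST(NULL AS INT), 'A' UNION ALL  -- a real NULL, not a string
  SELECT 4, 30, 'A' UNION ALL
  SELECT 5, 16, 'B' UNION ALL
  SELECT 6, 12, 'B'
)
SELECT team,
       AVG(money) AS avg_money,  -- NULL rows are skipped, so A averages 10 and 30
       SUM(money) AS sum_money
FROM sample
GROUP BY team
ORDER BY team;
```

If the aggregates skip NULL as documented, this should give 20 and 40 for team A and 16 and 48 for team B, which is the output asked for in the question.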
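Update: based on your comment (and again, I am not familiar with Impala), standard SQL should presumably work. Your error looks like a data type problem.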
SELECT team, AVG(CAST(money AS INT)), SUM(CAST(money AS INT))
FROM my_table
GROUP BY team
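If the data type really is the problem, a likely cause (an assumption, since the question does not show the schema) is that money is stored as STRING: MAX and MIN accept strings, while SUM and AVG need a numeric argument. A rough way to confirm this and to see which values would not survive the cast, assuming Impala returns NULL when a string cannot be parsed as a number:

```sql
-- Show the declared column types; look at the type reported for money.
DESCRIBE my_table;

-- List rows whose money value is present but does not parse as an integer
-- (for example the literal text 'null'). Assumes a failed cast yields NULL.
SELECT transaction_id, money
FROM my_table
WHERE money IS NOT NULL
  AND CAST(money AS INT) IS NULL;
```

Any rows returned by the second query would be treated as NULL by the CAST-based aggregate above and therefore left out of the sum and average.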
Simply divide the sum by the count:
-- COUNT(money) counts only non-NULL values, so this matches AVG(money)
SELECT team, SUM(money)/COUNT(money) AS avg_money, SUM(money) AS sum_money
FROM my_table
GROUP BY team