Impala: change the column type before performing the aggregation function for GROUP BY

I have a table, my_table:

transaction_id    |   money     |  team
--------------------------------------------
    1             |   10        |   A
    2             |   20        |   B
    3             |   null      |   A
    4             |   30        |   A
    5             |   16        |   B
    6             |   12        |   B

When I group by team, I can compute the max and min with this query:

select team, max(money), min(money) from my_table group by team

However, I cannot compute avg and sum because of the null value. That is:

select team, avg(money), sum(money) from my_table group by team

fails.

Is there a way to change the column type before computing the average and sum? That is, I would like the output to be:

team   |  avg(money)   |  sum(money)
--------------------------------------
 A     |  20           |  40
 B     |  16           |  48

Thanks!

According to the documentation provided by Cloudera, your query should work as is. The AVG function and the SUM function both ignore NULL.

SELECT team, AVG(money), SUM(money)
FROM my_table
GROUP BY team
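To check the NULL-skipping behaviour without a cluster, here is a minimal sketch that runs the same query against an in-memory SQLite database. SQLite is only a stand-in for Impala here (an assumption of this sketch), and the money column is assumed to be a real integer column with a true NULL in row 3.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Impala (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (transaction_id INTEGER, money INTEGER, team TEXT)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        (1, 10, "A"),
        (2, 20, "B"),
        (3, None, "A"),  # None is stored as a real SQL NULL
        (4, 30, "A"),
        (5, 16, "B"),
        (6, 12, "B"),
    ],
)

# AVG and SUM skip the NULL row instead of failing.
agg_rows = conn.execute(
    "SELECT team, AVG(money), SUM(money) "
    "FROM my_table GROUP BY team ORDER BY team"
).fetchall()
print(agg_rows)
```

If this works but the same query fails on your table, the column is probably not numeric in the first place.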

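Update: based on your comment (and again, I am not familiar with Impala), standard SQL should presumably work. Your error looks like a data type problem.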

SELECT team, AVG(CAST(money AS INT)), SUM(CAST(money AS INT))
FROM my_table
GROUP BY team
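A sketch of the data type scenario, under assumptions that are mine and not confirmed by the question: the money column is stored as text and the missing value is the literal string 'null'. It again uses SQLite as a stand-in for Impala. Because engines differ in what a CAST of a non-numeric string returns, the sketch turns 'null' into a real NULL with NULLIF before casting.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Impala (assumption).
conn = sqlite3.connect(":memory:")
# Assumption: money is a text column and the missing value is the string 'null'.
conn.execute(
    "CREATE TABLE my_table (transaction_id INTEGER, money TEXT, team TEXT)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        (1, "10", "A"),
        (2, "20", "B"),
        (3, "null", "A"),
        (4, "30", "A"),
        (5, "16", "B"),
        (6, "12", "B"),
    ],
)

# NULLIF maps the string 'null' to a real NULL, then CAST changes the type
# before the aggregates run, so the aggregates skip that row.
cast_rows = conn.execute(
    """
    SELECT team,
           AVG(CAST(NULLIF(money, 'null') AS INTEGER)),
           SUM(CAST(NULLIF(money, 'null') AS INTEGER))
    FROM my_table
    GROUP BY team
    ORDER BY team
    """
).fetchall()
print(cast_rows)
```

The NULLIF step is a defensive choice for portability; whether it is needed on your engine depends on how it handles casting a non-numeric string.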

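Just divide the sum by the count. COUNT(money) counts only non-NULL values, so the NULL row does not affect the average: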

SELECT team, SUM(money)/COUNT(money) AS AVG, SUM(money)
FROM my_table
GROUP BY team

Tested here: http://sqlfiddle.com/#!9/ba381/4
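To see why dividing by COUNT(money) rather than COUNT(*) matters, here is a small sketch, again with SQLite standing in for Impala (an assumption). The 1.0 multiplier is there only because SQLite does integer division on integers; it is not part of the original answer.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Impala (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (transaction_id INTEGER, money INTEGER, team TEXT)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [(1, 10, "A"), (2, 20, "B"), (3, None, "A"),
     (4, 30, "A"), (5, 16, "B"), (6, 12, "B")],
)

# COUNT(*) counts every row; COUNT(money) counts only rows where money is not NULL.
count_rows = conn.execute(
    "SELECT team, COUNT(*), COUNT(money), 1.0 * SUM(money) / COUNT(money) "
    "FROM my_table GROUP BY team ORDER BY team"
).fetchall()
print(count_rows)
```

For team A the two counts differ (3 rows, 2 non-NULL values), so dividing by COUNT(*) would give about 13.33 instead of the expected 20.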