Issue with aggregate columns when non-aggregates are not present
I'm running into an aggregate error in Amazon Redshift when I run the following query:
select case when frequency between (avg(frequency) + stddev(frequency)) and (avg(frequency) - stddev(frequency)) then round(avg(frequency) - stddev(frequency))||'-'||round(avg(frequency) + stddev(frequency))
when frequency between (avg(frequency) + 2*stddev(frequency)) and (avg(frequency) - 2*stddev(frequency)) then round(avg(frequency) - 2*stddev(frequency))||'-'||round(avg(frequency) + 2*stddev(frequency))
when frequency between (avg(frequency) + 3*stddev(frequency)) and (avg(frequency) - 3*stddev(frequency)) then round(avg(frequency) - 3*stddev(frequency))||'-'||round(avg(frequency) + 3*stddev(frequency))
else null
end as deviation
from schema.table
;
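The error tells me I need to include frequency in the GROUP BY clause. If I do that, I get "aggregates not allowed in group by". Does anyone know why this happens? My initial guess was that it might be a data type issue, but changing the types didn't help.
Thanks!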
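These queries can get confusing. You can either compute the aggregates separately in a sub-query and then use them on every row via a CROSS JOIN, or you can use analytic functions, which give you aggregate values without a GROUP BY: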
-- BETWEEN expects the lower bound first, so (avg - dev) goes on the left
SELECT case when frequency between (avg_Freq - dev_Freq) and (avg_Freq + dev_Freq) then round(avg_Freq - dev_Freq)||'-'||round(avg_Freq + dev_Freq)
when frequency between (avg_Freq - 2*dev_Freq) and (avg_Freq + 2*dev_Freq) then round(avg_Freq - 2*dev_Freq)||'-'||round(avg_Freq + 2*dev_Freq)
when frequency between (avg_Freq - 3*dev_Freq) and (avg_Freq + 3*dev_Freq) then round(avg_Freq - 3*dev_Freq)||'-'||round(avg_Freq + 3*dev_Freq)
else null
end as deviation
FROM schema.table
-- the sub-query returns exactly one row, so the CROSS JOIN attaches it to every row
CROSS JOIN (SELECT avg(frequency) AS avg_Freq
,stddev(frequency) AS dev_Freq
FROM schema.table
)sub
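If you want to try the cross-join pattern on toy data without a Redshift cluster, here is a minimal sketch using Python's built-in sqlite3 as a stand-in. Everything specific to it is an assumption for illustration: the table name `freq_demo`, the helper `bucket_rows`, and the hand-written `SampleStddev` aggregate (SQLite has no built-in stddev, so this registers an n - 1 sample version under that name). To keep it short it returns the band number 1, 2 or 3 instead of the range string.

```python
import math
import sqlite3


class SampleStddev:
    """Sample standard deviation (n - 1 denominator), registered as stddev()."""

    def __init__(self):
        self.values = []

    def step(self, value):
        # Called once per input row; NULLs are ignored like other SQL aggregates.
        if value is not None:
            self.values.append(float(value))

    def finalize(self):
        n = len(self.values)
        if n < 2:
            return None
        mean = sum(self.values) / n
        return math.sqrt(sum((v - mean) ** 2 for v in self.values) / (n - 1))


def bucket_rows(frequencies):
    """Return (frequency, sigma_band) per row; sigma_band is 1, 2, 3 or None."""
    conn = sqlite3.connect(":memory:")
    conn.create_aggregate("stddev", 1, SampleStddev)
    conn.execute("CREATE TABLE freq_demo (frequency REAL)")
    conn.executemany("INSERT INTO freq_demo VALUES (?)", [(f,) for f in frequencies])
    rows = conn.execute(
        """
        SELECT frequency,
               CASE
                 WHEN frequency BETWEEN avg_freq - dev_freq AND avg_freq + dev_freq THEN 1
                 WHEN frequency BETWEEN avg_freq - 2 * dev_freq AND avg_freq + 2 * dev_freq THEN 2
                 WHEN frequency BETWEEN avg_freq - 3 * dev_freq AND avg_freq + 3 * dev_freq THEN 3
                 ELSE NULL
               END AS sigma_band
        FROM freq_demo
        -- one-row sub-query: its columns become available on every row
        CROSS JOIN (SELECT avg(frequency) AS avg_freq,
                           stddev(frequency) AS dev_freq
                    FROM freq_demo) sub
        ORDER BY frequency
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for frequency, band in bucket_rows([1, 2, 3, 4, 5, 6, 7, 8, 9, 40]):
        print(frequency, band)
```

The point is only that the aggregates are computed once in the sub-query and then compared against each row's own frequency, which is what the original query was trying to do in a single SELECT.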
Alternatively, you can add OVER() to each aggregate in your existing query:
-- lower bound first in each BETWEEN; OVER() turns each aggregate into a window function over all rows
select case when frequency between (avg(frequency) OVER() - stddev(frequency) OVER()) and (avg(frequency) OVER() + stddev(frequency) OVER()) then round(avg(frequency) OVER() - stddev(frequency) OVER())||'-'||round(avg(frequency) OVER() + stddev(frequency) OVER())
when frequency between (avg(frequency) OVER() - 2*stddev(frequency) OVER()) and (avg(frequency) OVER() + 2*stddev(frequency) OVER()) then round(avg(frequency) OVER() - 2*stddev(frequency) OVER())||'-'||round(avg(frequency) OVER() + 2*stddev(frequency) OVER())
when frequency between (avg(frequency) OVER() - 3*stddev(frequency) OVER()) and (avg(frequency) OVER() + 3*stddev(frequency) OVER()) then round(avg(frequency) OVER() - 3*stddev(frequency) OVER())||'-'||round(avg(frequency) OVER() + 3*stddev(frequency) OVER())
else null
end as deviation
from schema.table
I'm not 100% sure about Redshift syntax, but I believe both should work.
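To see what OVER() changes, here is a small sketch, again with Python's sqlite3 standing in for Redshift (window functions need SQLite 3.25 or newer, which I'm assuming your Python bundles). The table `freq_demo` and the function `with_overall_average` are made up for the example, and it only uses avg because SQLite has no built-in stddev.

```python
import sqlite3


def with_overall_average(frequencies):
    """Return (frequency, overall_avg) for every row using avg() OVER ()."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE freq_demo (frequency REAL)")
    conn.executemany("INSERT INTO freq_demo VALUES (?)", [(f,) for f in frequencies])
    rows = conn.execute(
        """
        SELECT frequency,
               -- empty OVER (): the window is the whole result set,
               -- so every row gets the same overall average
               avg(frequency) OVER () AS avg_freq
        FROM freq_demo
        ORDER BY frequency
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for frequency, avg_freq in with_overall_average([2, 4, 9]):
        print(frequency, avg_freq)
```

The rows are not collapsed: each input row comes back with the overall average next to it, so the per-row frequency and the aggregate can sit in the same expression without a GROUP BY.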
You could break it up like this:
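WITH temp AS (
SELECT avg(frequency) AS avg_freq, stddev(frequency) AS dev_freq
FROM schema.table
)
SELECT case when frequency between (temp.avg_freq - temp.dev_freq) and (temp.avg_freq + temp.dev_freq) then round(temp.avg_freq - temp.dev_freq)||'-'||round(temp.avg_freq + temp.dev_freq)
-- repeat the branch above for 2*dev_freq and 3*dev_freq
else null
end as deviation
FROM schema.table
CROSS JOIN temp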
You'll have to check the exact statement. I wrote this off the top of my head.