How to calculate the average for every month from the start of the year in Presto SQL (Athena)?
Below is a sample of the table data I have:
| date | value |
| 2020-01-01 | 20 |
| 2020-01-14 | 10 |
| 2020-02-02 | 30 |
| 2020-02-11 | 25 |
| 2020-02-25 | 25 |
| 2020-03-13 | 34 |
| 2020-03-21 | 10 |
| 2020-04-06 | 55 |
| 2020-04-07 | 11 |
I want to produce the following result set:
| date | value | average |
| 2020-01-01 | 20 | Jan average |
| 2020-01-14 | 10 | Jan average |
| 2020-02-02 | 30 | Jan & Feb average |
| 2020-02-11 | 25 | Jan & Feb average |
| 2020-02-25 | 25 | Jan & Feb average |
| 2020-03-13 | 34 | Jan & Feb & Mar average |
| 2020-03-21 | 10 | Jan & Feb & Mar average |
| 2020-04-06 | 55 | Jan & Feb & Mar & Apr average |
| 2020-04-07 | 11 | Jan & Feb & Mar & Apr average |
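I tried using a window function with OVER() and PARTITION BY, but I only managed to get the average per month, not the average from the start of the year.
Any suggestions, please?
Thanks.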
I think you want:
select
t.*,
avg(value) over(
partition by year(date)
order by month(date)
) running_avg
from mytable t
This puts each year in its own partition and orders the rows within each partition by month.
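Ordering by month alone is enough here because, when a window has an ORDER BY and no explicit frame, Presto uses the default frame RANGE UNBOUNDED PRECEDING: rows with the same month are peers, so each of them gets the average through the end of its month. The block below is a minimal Python sketch, not Presto code, that emulates that frame on the sample data from the question. The helper name `running_avg_by_month` is made up for illustration.

```python
from datetime import date

# Sample rows from the question: (date, value)
rows = [
    (date(2020, 1, 1), 20), (date(2020, 1, 14), 10),
    (date(2020, 2, 2), 30), (date(2020, 2, 11), 25), (date(2020, 2, 25), 25),
    (date(2020, 3, 13), 34), (date(2020, 3, 21), 10),
    (date(2020, 4, 6), 55), (date(2020, 4, 7), 11),
]


def running_avg_by_month(rows):
    """Emulate avg(value) over (partition by year(date) order by month(date)).

    The frame for a row is every row of the same year whose month is
    less than or equal to the row's month (peers included).
    """
    result = []
    for d, v in rows:
        frame = [x for (e, x) in rows if e.year == d.year and e.month <= d.month]
        result.append((d, v, sum(frame) / len(frame)))
    return result


for d, v, avg in running_avg_by_month(rows):
    print(d, v, round(avg, 2))
```

On the sample data this yields 15 for the January rows, 22 for the February rows (January through February), 22 for the March rows, and about 24.44 for the April rows.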
Not sure I understood your question, but if all you want is a running average for each row, computed per year:
SELECT date, value, (
SELECT AVG(value)
FROM data ds
WHERE ds.date <= d.date AND YEAR(ds.date) = YEAR(d.date)
) average
FROM data d
ORDER BY d.date ASC;
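Example with MySQL (the syntax is the same for this particular query).
If you want the average to include later rows from the same month, replace the condition ds.date <= d.date with MONTH(ds.date) <= MONTH(d.date) and keep the YEAR condition.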
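The two conditions give different results within a month: comparing full dates produces a row-by-row running average, while comparing months gives every row of a month the same value. Here is a rough Python sketch of both filters on the question's sample data; `avg_where` is a hypothetical helper, not part of any library.

```python
from datetime import date

# Sample rows from the question: (date, value)
rows = [
    (date(2020, 1, 1), 20), (date(2020, 1, 14), 10),
    (date(2020, 2, 2), 30), (date(2020, 2, 11), 25), (date(2020, 2, 25), 25),
    (date(2020, 3, 13), 34), (date(2020, 3, 21), 10),
    (date(2020, 4, 6), 55), (date(2020, 4, 7), 11),
]


def avg_where(rows, current, keep):
    """Average the values of rows in the same year as `current` that pass `keep`."""
    vals = [v for (d, v) in rows if d.year == current.year and keep(d, current)]
    return sum(vals) / len(vals)


def by_date(other, cur):
    # Mirrors: ds.date <= d.date
    return other <= cur


def by_month(other, cur):
    # Mirrors: MONTH(ds.date) <= MONTH(d.date)
    return other.month <= cur.month


for d, v in rows:
    print(d, v, round(avg_where(rows, d, by_date), 2), round(avg_where(rows, d, by_month), 2))
```

For 2020-01-01, for instance, the date filter gives 20 (only that row), while the month filter gives 15 (both January rows), which is what the question asks for.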
The query below should give your expected output:
SELECT A.*,
(
SELECT AVG(B.value * 1.00)
FROM your_table B
WHERE YEAR(B.date) = YEAR(A.date)
AND MONTH(B.date) <= MONTH(A.date)
) AS average
FROM your_table A
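This query gives you the output per year. If you do not want to partition by YEAR, just remove the YEAR filter from the subquery.
The query below returns the AVG without regard to YEAR: the average of all rows up to the end of the current row's month.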
SELECT A.*,
(
SELECT AVG(Value * 1.00)
FROM your_table B
WHERE B.date <=
(
SELECT MAX(Date)
FROM your_table C
WHERE YEAR(c.Date) = YEAR(A.Date)
AND MONTH(C.Date) = MONTH(A.Date)
)
)
FROM your_table A
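The year-agnostic query works in two steps: find the latest date in the current row's year and month, then average every row on or before that date. The following is a small Python sketch of that logic. The two December 2019 rows are invented for illustration so that the data spans a year boundary, and `avg_through_month_end` is a hypothetical name.

```python
from datetime import date

# Hypothetical data spanning two years: (date, value).
# The 2019 rows are made up; the 2020 rows come from the question.
rows = [
    (date(2019, 12, 5), 40), (date(2019, 12, 20), 20),
    (date(2020, 1, 1), 20), (date(2020, 1, 14), 10),
]


def avg_through_month_end(rows):
    """For each row, average all rows dated on or before the last
    date present in that row's year and month, ignoring year boundaries."""
    result = []
    for d, v in rows:
        # Inner subquery: MAX(date) within the same year and month.
        cutoff = max(e for (e, _) in rows if (e.year, e.month) == (d.year, d.month))
        # Outer subquery: AVG over every row with date <= cutoff.
        vals = [x for (e, x) in rows if e <= cutoff]
        result.append((d, v, sum(vals) / len(vals)))
    return result


for d, v, avg in avg_through_month_end(rows):
    print(d, v, avg)
```

Here the January 2020 rows get 22.5, the average of all four rows, because the December 2019 rows are no longer filtered out by year.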
SELECT a.date,
a.value,
(
SELECT AVG(b.value)
FROM myTable b
WHERE b.date < a.date
AND YEAR(b.date) = YEAR(a.date)
) AS average
FROM myTable a
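Note that this last query compares with a strict `<`, so each row's average covers only earlier rows of the same year and excludes the row itself; the first row of a year has nothing to average, and SQL's AVG over no rows is NULL. A quick Python sketch of that behaviour on the question's first few rows, with `avg_strictly_before` as an illustrative name and `None` standing in for NULL:

```python
from datetime import date

# First rows of the sample data from the question: (date, value)
rows = [
    (date(2020, 1, 1), 20), (date(2020, 1, 14), 10),
    (date(2020, 2, 2), 30), (date(2020, 2, 11), 25),
]


def avg_strictly_before(rows, current):
    """Average of same-year rows dated strictly before `current`.

    Returns None when no such rows exist, standing in for SQL NULL.
    """
    vals = [v for (d, v) in rows if d < current and d.year == current.year]
    return sum(vals) / len(vals) if vals else None


for d, v in rows:
    print(d, v, avg_strictly_before(rows, d))
```

This does not match the result set in the question, where every row of a month shares the average through the end of that month.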