How to group data with SQL
How do I group my Timestream data?
The table looks like this (simplified):
point_delivery_number | measure_name | time | value
------------------------------------------------------------------------
AT3265345345 | "consumption" | 2021-01-02 12:00:00.00 | 0.13
AT3265345345 | "generation" | 2021-01-02 12:00:00.00 | 0.32
I want to query where point_delivery_number = xx and time = xx.
The result should be:
point_delivery_number | consumption | time | generation
----------------------------------------------------------
AT3265345345 | 0.13 | xxxxx | 0.32
What I tried:
SELECT point_delivery_number, measure_name, time, measure_value::double
FROM "energy_datapoints"."energy_data"
WHERE point_delivery_number='AT234123234541243'
GROUP BY point_delivery_number, measure_name, time, measure_value::double;
The result is:
point_delivery_number | measure_name | time | value
------------------------------------------------------------------------
AT3265345345 | "generation" | 2021-01-02 12:15:00.00 | 0.123
AT3265345345 | "generation" | 2021-01-02 12:00:00.00 | 0.32
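I want consumption and generation to be attributes (columns) rather than values.
You are dealing with a key/value table. For each point_delivery_number it contains rows with a key (measure_name) and values (time and value).
You want to get the values for two keys. One way is to select both and join them: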
select
  point_delivery_number,
  c.value as consumption,
  g.value as generation
from
  (select * from energy_datapoints.energy_data where measure_name = 'consumption') c
full outer join
  (select * from energy_datapoints.energy_data where measure_name = 'generation') g
using (point_delivery_number)
order by point_delivery_number;
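The expected result in the question also carries a time column, so the join probably has to match on time as well. Joining on point_delivery_number alone pairs every consumption row of a point with every generation row of that point. The following is only a sketch of that effect, run against an in-memory SQLite database rather than Timestream. The table name is shortened, the 12:15 consumption value is invented for illustration, and a LEFT JOIN stands in for the FULL OUTER JOIN because older SQLite versions do not support the latter.

```python
import sqlite3

# In-memory stand-in for the Timestream table (assumption: same four columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE energy_data ("
    "point_delivery_number TEXT, measure_name TEXT, time TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO energy_data VALUES (?, ?, ?, ?)",
    [
        ("AT3265345345", "consumption", "2021-01-02 12:00:00", 0.13),
        ("AT3265345345", "generation", "2021-01-02 12:00:00", 0.32),
        # The consumption value below is made up for this sketch.
        ("AT3265345345", "consumption", "2021-01-02 12:15:00", 0.20),
        ("AT3265345345", "generation", "2021-01-02 12:15:00", 0.123),
    ],
)

# Matching on the point only: every consumption row meets every generation row.
JOIN_ON_POINT_ONLY = """
SELECT c.point_delivery_number, c.value AS consumption, g.value AS generation
FROM (SELECT * FROM energy_data WHERE measure_name = 'consumption') c
LEFT JOIN (SELECT * FROM energy_data WHERE measure_name = 'generation') g
  ON g.point_delivery_number = c.point_delivery_number
"""

# Matching on point and time: one row per point and timestamp.
JOIN_ON_POINT_AND_TIME = """
SELECT c.point_delivery_number, c.value AS consumption, c.time, g.value AS generation
FROM (SELECT * FROM energy_data WHERE measure_name = 'consumption') c
LEFT JOIN (SELECT * FROM energy_data WHERE measure_name = 'generation') g
  ON g.point_delivery_number = c.point_delivery_number
 AND g.time = c.time
ORDER BY c.point_delivery_number, c.time
"""

rows_point_only = conn.execute(JOIN_ON_POINT_ONLY).fetchall()
rows_point_and_time = conn.execute(JOIN_ON_POINT_AND_TIME).fetchall()

print(len(rows_point_only))  # number of consumption/generation combinations
for row in rows_point_and_time:
    print(row)
```

With a LEFT JOIN, timestamps that have a generation row but no consumption row would be dropped; a FULL OUTER JOIN, where the engine supports it, keeps both sides.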
Another way is aggregation. You want one row per point_delivery_number, so GROUP BY point_delivery_number. Then use MIN or MAX on a condition to pick only the measure name in question.
select
  point_delivery_number,
  min(case when measure_name = 'consumption' then value end) as consumption,
  min(case when measure_name = 'generation' then value end) as generation
from energy_datapoints.energy_data
group by point_delivery_number
order by point_delivery_number;
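If the time column from the expected output should survive, one option is to group by time in addition to point_delivery_number, so that each point and timestamp gets its own row. This is a hedged sketch on an in-memory SQLite database, not on Timestream. The shortened table name and the rows at 12:15 (consumption) and 12:30 are assumptions added to show what happens when one measure is missing.

```python
import sqlite3

# In-memory stand-in for the Timestream table (assumption: same four columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE energy_data ("
    "point_delivery_number TEXT, measure_name TEXT, time TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO energy_data VALUES (?, ?, ?, ?)",
    [
        ("AT3265345345", "consumption", "2021-01-02 12:00:00", 0.13),
        ("AT3265345345", "generation", "2021-01-02 12:00:00", 0.32),
        # Invented rows: a second complete timestamp and one without generation.
        ("AT3265345345", "consumption", "2021-01-02 12:15:00", 0.20),
        ("AT3265345345", "generation", "2021-01-02 12:15:00", 0.123),
        ("AT3265345345", "consumption", "2021-01-02 12:30:00", 0.05),
    ],
)

# Conditional aggregation: the CASE yields NULL for the other measure,
# and MIN ignores NULLs, so each column keeps only "its" measure.
PIVOT = """
SELECT
  point_delivery_number,
  MIN(CASE WHEN measure_name = 'consumption' THEN value END) AS consumption,
  time,
  MIN(CASE WHEN measure_name = 'generation' THEN value END) AS generation
FROM energy_data
WHERE point_delivery_number = ?
GROUP BY point_delivery_number, time
ORDER BY time
"""

pivoted = conn.execute(PIVOT, ("AT3265345345",)).fetchall()
for row in pivoted:
    print(row)
```

A timestamp without a generation row comes back with NULL (None in Python) in that column instead of disappearing, which is the main practical difference from an inner join.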
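Disclaimer: I don't know Amazon Timestream. The queries above are standard SQL and should work in most RDBMS (exactly as written or with slight modifications).
As to your own query: you make it look like you are aggregating, but you are really just selecting the single rows, because your GROUP BY clause includes all columns. GROUP BY ____ means "I want to aggregate my data so as to get one result row per ____". You want one result row per point_delivery_number, hence GROUP BY point_delivery_number.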
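The difference between grouping by every column and grouping by the key alone can be checked on a small sample. The sketch below uses an in-memory SQLite database as a stand-in; the second point delivery number and the 12:15 row are hypothetical values added only to make the row counts visible.

```python
import sqlite3

# In-memory stand-in for the Timestream table (assumption: same four columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE energy_data ("
    "point_delivery_number TEXT, measure_name TEXT, time TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO energy_data VALUES (?, ?, ?, ?)",
    [
        ("AT3265345345", "consumption", "2021-01-02 12:00:00", 0.13),
        ("AT3265345345", "generation", "2021-01-02 12:00:00", 0.32),
        ("AT3265345345", "generation", "2021-01-02 12:15:00", 0.123),
        # Hypothetical second point, only here to show one row per group.
        ("AT0000000001", "consumption", "2021-01-02 12:00:00", 0.50),
    ],
)

plain_rows = conn.execute(
    "SELECT point_delivery_number, measure_name, time, value FROM energy_data"
).fetchall()

# Grouping by every selected column: each distinct row is its own group.
grouped_by_all = conn.execute(
    "SELECT point_delivery_number, measure_name, time, value "
    "FROM energy_data "
    "GROUP BY point_delivery_number, measure_name, time, value"
).fetchall()

# Grouping by the key only: one result row per point_delivery_number.
grouped_by_point = conn.execute(
    "SELECT point_delivery_number, COUNT(*) "
    "FROM energy_data "
    "GROUP BY point_delivery_number "
    "ORDER BY point_delivery_number"
).fetchall()

print(sorted(grouped_by_all) == sorted(plain_rows))
print(grouped_by_point)
```

As long as the rows are distinct, grouping by all columns returns the same rows as no grouping at all; only the second query actually collapses rows.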