Equivalent of FIRST() and LAST() in Impala

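I have data from different sensor units arriving in Hadoop with a unix_time stamp. In SQL it is easy to get the first and last value for each day using the FIRST and LAST functions, as shown below. Here I am fetching the temperature at the start and at the end of the day.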

SELECT unit, FIRST(Temp) as Start_Day_Value, LAST(Temp) as End_Day_Value
FROM Sensor_Data.Table
WHERE cast(ts/1000 as TIMESTAMP) BETWEEN "2021-01-05 00:00:00" AND "2021-01-05 23:59:59"
GROUP BY unit
ORDER BY unit;

Is there any equivalent command in Impala, so that I can get the first and last value of each column the way I do in SQL?

Standard SQL has the FIRST_VALUE() window function, but no aggregate equivalent. So one way to do what you want, in Impala or in almost any other database, is:

SELECT DISTINCT unit,
       FIRST_VALUE(Temp) OVER (PARTITION BY unit ORDER BY ts) as Start_Day_Value, 
       FIRST_VALUE(Temp) OVER (PARTITION BY unit ORDER BY ts DESC) as End_Day_Value
FROM Sensor_Data.Table
WHERE cast(ts/1000 as TIMESTAMP) >= '2021-01-05' AND 
      cast(ts/1000 as TIMESTAMP) < '2021-01-06'
ORDER BY unit;

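Note that I have also simplified the date comparison: the half-open range (>= the day, < the next day) also covers readings in the last second before midnight.

Note: I am not sure that the cast() works (I would expect the code to use from_timestamp()). But that is not what you are asking about.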
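To sanity-check the logic of the FIRST_VALUE() query without a cluster, here is a minimal sketch that runs the same query shape on an in-memory SQLite database from Python. SQLite is only a stand-in for Impala here, and it needs a build with window function support (SQLite 3.25 or later). The table name sensor_data, the sample rows, and the helper first_last_per_unit are made up for illustration. The date filter compares the raw millisecond value instead of casting to TIMESTAMP, which is an assumption made to keep the sketch portable.

```python
import sqlite3

# Hypothetical bounds in epoch milliseconds (UTC) for the half-open range
# [2021-01-05 00:00:00, 2021-01-06 00:00:00).
DAY_START_MS = 1609804800000
DAY_END_MS = 1609891200000

# Made-up readings: (unit, ts in epoch ms, temp), inserted out of order.
SAMPLE_ROWS = [
    ("A", DAY_START_MS + 3_600_000, 12.5),
    ("B", DAY_START_MS + 7_200_000, 21.0),
    ("A", DAY_END_MS - 1_000, 9.0),     # last reading of the day for A
    ("A", DAY_START_MS + 1_000, 10.0),  # first reading of the day for A
    ("B", DAY_START_MS + 5_000, 20.0),
    ("A", DAY_END_MS, 99.0),            # midnight of the next day, excluded
]

# Same shape as the query above: two FIRST_VALUE() windows with opposite
# sort orders, collapsed to one row per unit by DISTINCT.
QUERY = """
SELECT DISTINCT unit,
       FIRST_VALUE(temp) OVER (PARTITION BY unit ORDER BY ts) AS start_day_value,
       FIRST_VALUE(temp) OVER (PARTITION BY unit ORDER BY ts DESC) AS end_day_value
FROM sensor_data
WHERE ts >= ? AND ts < ?
ORDER BY unit
"""


def first_last_per_unit(conn, start_ms, end_ms):
    """Return (unit, first temp, last temp) for rows with start_ms <= ts < end_ms."""
    return conn.execute(QUERY, (start_ms, end_ms)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sensor_data (unit TEXT, ts INTEGER, temp REAL)")
conn.executemany("INSERT INTO sensor_data VALUES (?, ?, ?)", SAMPLE_ROWS)

rows = first_last_per_unit(conn, DAY_START_MS, DAY_END_MS)
print(rows)
```

The reading stamped exactly at midnight of the next day is left out by the `<` bound, which is the point of the half-open range.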

EDIT:

In Impala, you can use GROUP BY:

SELECT u.unit,
       -- seqnum_asc = 1 is the earliest row per unit, seqnum_desc = 1 the latest
       MAX(CASE WHEN seqnum_asc = 1 THEN temp END) as first_temp,
       MAX(CASE WHEN seqnum_desc = 1 THEN temp END) as last_temp
FROM (SELECT unit, temp,
             ROW_NUMBER() OVER (PARTITION BY unit ORDER BY ts) as seqnum_asc,
             ROW_NUMBER() OVER (PARTITION BY unit ORDER BY ts desc) as seqnum_desc
      FROM Sensor_Data.Table
      WHERE cast(ts/1000 as TIMESTAMP) >= '2021-01-05' AND 
            cast(ts/1000 as TIMESTAMP) < '2021-01-06'
     ) u
GROUP BY unit
ORDER BY unit;
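The ROW_NUMBER() variant can be checked the same way. The sketch below assumes that the last value is meant to be picked with seqnum_desc = 1 and that the subquery also selects temp. It again uses an in-memory SQLite database (3.25 or later) as a stand-in for Impala, with a made-up table sensor_data and made-up rows. Each CASE expression is non-NULL for exactly one row per unit, so MAX() simply returns that one value. Unit C has a single reading, so its first and last values coincide.

```python
import sqlite3

# Made-up readings: (unit, ts, temp). Unit C has only one row.
READINGS = [
    ("A", 2, 12.5),
    ("A", 3, 9.0),
    ("A", 1, 10.0),
    ("C", 5, 30.0),
]

# Number the rows per unit in both directions, then keep the temp of the
# row numbered 1 in each direction via conditional aggregation.
ROW_NUMBER_QUERY = """
SELECT u.unit,
       MAX(CASE WHEN seqnum_asc = 1 THEN temp END) AS first_temp,
       MAX(CASE WHEN seqnum_desc = 1 THEN temp END) AS last_temp
FROM (SELECT unit, temp,
             ROW_NUMBER() OVER (PARTITION BY unit ORDER BY ts) AS seqnum_asc,
             ROW_NUMBER() OVER (PARTITION BY unit ORDER BY ts DESC) AS seqnum_desc
      FROM sensor_data
     ) u
GROUP BY u.unit
ORDER BY u.unit
"""

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sensor_data (unit TEXT, ts INTEGER, temp REAL)")
db.executemany("INSERT INTO sensor_data VALUES (?, ?, ?)", READINGS)

result = db.execute(ROW_NUMBER_QUERY).fetchall()
print(result)
```

Compared with SELECT DISTINCT over window functions, this form produces one row per unit through GROUP BY, so further aggregates can be added to the same query.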

I solved the problem in the following way:

SELECT DISTINCT u.unit, u.initial_temp, u.final_temp, u.initial_p, u.final_p
FROM (SELECT unit,
             first_value(temperature) OVER (PARTITION BY unit ORDER BY ts DESC) as final_temp,
             first_value(temperature) OVER (PARTITION BY unit ORDER BY ts ASC) as initial_temp,
             first_value(pressure) OVER (PARTITION BY unit ORDER BY ts DESC) as final_p,
             first_value(pressure) OVER (PARTITION BY unit ORDER BY ts ASC) as initial_p
      FROM Sensor_Data.Table
      WHERE cast(ts/1000 as TIMESTAMP) BETWEEN "2020-12-11 00:00:00" AND "2020-12-11 23:59:59"
     ) AS u
ORDER BY unit;