How do I calculate an average time in BigQuery when the column uses the TIME type?

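I am trying to get the average time, but the AVG function does not support the TIME type. I tried using CAST, as explained in some other posts, but that does not seem to work either. Thanks.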

WITH october_fall AS
   (SELECT
   start_station_name,
   end_station_name,
   start_station_id,
   end_station_id,
   EXTRACT (DATE FROM started_at) AS start_date,
   EXTRACT(DAYOFWEEK FROM started_at) AS start_week_date,
   EXTRACT (TIME FROM started_at) AS start_time,    
   EXTRACT (DATE FROM ended_at) AS end_date,
   EXTRACT(DAYOFWEEK FROM ended_at) AS end_week_date,    
   EXTRACT (TIME FROM ended_at) AS end_time,
   DATETIME_DIFF (ended_at,started_at, MINUTE) AS total_lenght,
   member_casual
FROM 
   `ciclystic.cyclistic_seasonal_analysis.fall_202010` AS fall_analysis
ORDER BY 
   started_at DESC)
SELECT
   COUNT (start_week_date) AS avg_start_1,
   AVG (start_time) AS avg_start_time_1, -- this is where the problem starts
   member_casual
FROM 
   october_fall
WHERE 
   start_week_date = 1
GROUP BY
   member_casual

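BigQuery cannot compute AVG over the TIME type, so you get an error message if you try.

Instead, you can compute the AVG over an INT64 value.
time_ts is a TIMESTAMP column.
I use time_diff to get the difference in seconds between the time and "00:00:00". Averaging that gives a FLOAT64 number of seconds, which I cast to INT64.
I created a function, secondToTime. It simply computes the hours/minutes/seconds and parses them back into a TIME.

For the DATE type, I think you can do it the same way.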
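As a rough sketch of the DATE case, the same idea can be expressed with UNIX_DATE and DATE_FROM_UNIX_DATE, which convert a DATE to and from an INT64 day count. The sample_dates CTE and its values are made up for illustration; in practice the dates would come from your own column.

```sql
-- Hypothetical sample dates standing in for a real DATE column.
WITH sample_dates AS (
  SELECT DATE '2020-10-01' AS start_date UNION ALL
  SELECT DATE '2020-10-03' UNION ALL
  SELECT DATE '2020-10-08'
)
SELECT
  -- UNIX_DATE returns days since 1970-01-01 as INT64, which AVG accepts.
  -- AVG returns FLOAT64, so it is cast (rounded) back to INT64 before
  -- being converted to a DATE again.
  DATE_FROM_UNIX_DATE(CAST(AVG(UNIX_DATE(start_date)) AS INT64)) AS avg_start_date
FROM sample_dates;
```

With these three sample rows the average offset is 3 days after 2020-10-01, so the query should return 2020-10-04.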

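create temp function secondToTime (seconds INT64)
    returns time 
    as (
        -- DIV does integer division. "/" returns FLOAT64, and casting that
        -- to an integer rounds instead of truncating (3599 / 3600 would become 1).
        PARSE_TIME (
            "%H:%M:%S",
            -- FORMAT builds a zero-padded string; CONCAT only accepts STRING or BYTES.
            format(
                "%02d:%02d:%02d",
                div(seconds, 3600),
                div(mod(seconds, 3600), 60),
                mod(seconds, 60)
            )
        )
    );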


with october_fall as (
    select
        extract (date from time_ts) as start_date,
        extract (time from time_ts) as start_time
    from `bigquery-public-data.hacker_news.comments`
    limit 10
) SELECT 
    avg(time_diff(start_time, time '00:00:00', second)),
    secondToTime(
        cast(avg(time_diff(start_time, time '00:00:00', second)) as INT64) 
    ),
    secondToTime(0),
    secondToTime(60),
    secondToTime(3601),
    secondToTime(7265)
FROM october_fall
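If you would rather not define a helper function, one possible variation is to add the averaged seconds back to midnight with TIME_ADD. The sketch below is self-contained: the three rows in october_fall are hypothetical stand-ins for the CTE in the question, and the column aliases are only illustrative.

```sql
-- Hypothetical sample rows standing in for the october_fall CTE from the question.
WITH october_fall AS (
  SELECT TIME '08:00:00' AS start_time, 'member' AS member_casual UNION ALL
  SELECT TIME '10:00:00', 'member' UNION ALL
  SELECT TIME '17:30:00', 'casual'
)
SELECT
  member_casual,
  COUNT(*) AS ride_count,
  -- Average the seconds since midnight (FLOAT64), round to INT64,
  -- then add that many seconds back to midnight to get a TIME.
  TIME_ADD(
    TIME '00:00:00',
    INTERVAL CAST(AVG(TIME_DIFF(start_time, TIME '00:00:00', SECOND)) AS INT64) SECOND
  ) AS avg_start_time
FROM october_fall
GROUP BY member_casual;
```

For the sample rows this should give 09:00:00 for member and 17:30:00 for casual. Note that a plain average treats times as points on a line from 00:00 to 24:00, so rides clustered around midnight (for example 23:50 and 00:10) average to about noon rather than midnight.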

Try the following:

SELECT
   COUNT (start_week_date) AS avg_start_1,
   TIME(
     EXTRACT(hour   FROM AVG(start_time - '0:0:0')), 
     EXTRACT(minute FROM AVG(start_time - '0:0:0')), 
     EXTRACT(second FROM AVG(start_time - '0:0:0'))
   ) as avg_start_time_1,
   member_casual
FROM 
   october_fall
WHERE 
   start_week_date = 1
GROUP BY
   member_casual     

Another option is:

SELECT
   COUNT (start_week_date) AS avg_start_1,
   PARSE_TIME('0-0 0 %H:%M:%E*S', '' || AVG(start_time - '0:0:0')) as avg_start_time_1,
   member_casual
FROM 
   october_fall
WHERE 
   start_week_date = 1
GROUP BY
   member_casual