How to take average(mean) of timestamp('yyyy-mm-dd hh:mm:ss') in hive?
I have a log table. It looks like this:
user_name idle_hours working_hours start_time stop_time
sahil24c@gmail.com 2019-10-24 05:05:00 2019-10-24 05:50:00 2019-10-24 08:30:02 2019-10-24 19:25:02
magadum@gmail.com 2019-10-24 02:15:00 2019-10-24 08:39:59 2019-10-24 08:30:02 2019-10-24 19:25:01
yathink3@gmail.com 2019-10-24 01:30:00 2019-10-24 09:24:59 2019-10-24 08:30:02 2019-10-24 19:25:01
shelkeva@gmail.com 2019-10-24 00:30:00 2019-10-24 09:10:01 2019-10-24 08:45:01 2019-10-24 18:25:02
puruissim@gmail.com 2019-10-24 03:15:00 2019-10-24 07:19:59 2019-10-24 08:50:02 2019-10-24 19:25:01
sangita.awa@gmail.com 2019-10-24 01:55:00 2019-10-24 08:40:00 2019-10-24 08:50:01 2019-10-24 19:25:01
vaishusawan@gmail.com 2019-10-24 00:35:00 2019-10-24 09:55:00 2019-10-24 08:55:01 2019-10-24 19:25:01
you@example.com 2019-10-24 02:35:00 2019-10-24 08:04:59 2019-10-24 08:45:02 2019-10-24 19:25:01
samadhanma@gmail.com 2019-10-24 01:10:00 2019-10-24 08:39:59 2019-10-24 09:00:02 2019-10-24 18:50:01
I want to find the average working hours.
select * from workinglogs where unix_timestamp(working_hours) < AVG(unix_timestamp(working_hours));
When I run this query, it doesn't work.
The error says: FAILED: SemanticException [Error 10128]: Line 1:64 Not yet supported place for UDAF 'AVG'
Since you are using a UDAF, you have to use GROUP BY. Select each column explicitly (don't use *), then group by the selected columns.
SELECT col1, col2, col3,......coln FROM workinglogs GROUP BY col1, col2, col3,......coln HAVING unix_timestamp(working_hours) < AVG(unix_timestamp(working_hours));
You could take this approach instead:
a subquery that computes the AVG, and an outer query that filters the output.
Using your data as an example:
+------------------------+-------------------------+----------------------------+-------------------------+------------------------+--+
| workinglogs.user_name | workinglogs.idle_hours | workinglogs.working_hours | workinglogs.start_time | workinglogs.stop_time |
+------------------------+-------------------------+----------------------------+-------------------------+------------------------+--+
| magadum@gmail.com | 2019-10-24 02:15:00.0 | 2019-10-24 08:39:59.0 | 2019-10-24 08:30:02.0 | 2019-10-24 19:25:01.0 |
| yathink3@gmail.com | 2019-10-24 01:30:00.0 | 2019-10-24 09:24:59.0 | 2019-10-24 08:30:02.0 | 2019-10-24 19:25:01.0 |
| shelkeva@gmail.com | 2019-10-24 00:30:00.0 | 2019-10-24 09:10:01.0 | 2019-10-24 08:45:01.0 | 2019-10-24 18:25:02.0 |
| puruissim@gmail.com | 2019-10-24 03:15:00.0 | 2019-10-24 07:19:59.0 | 2019-10-24 08:50:02.0 | 2019-10-24 19:25:01.0 |
| sangita.awa@gmail.com | 2019-10-24 01:55:00.0 | 2019-10-24 08:40:00.0 | 2019-10-24 08:50:01.0 | 2019-10-24 19:25:01.0 |
| vaishusawan@gmail.com | 2019-10-24 00:35:00.0 | 2019-10-24 09:55:00.0 | 2019-10-24 08:55:01.0 | 2019-10-24 19:25:01.0 |
| you@example.com | 2019-10-24 02:35:00.0 | 2019-10-24 08:04:59.0 | 2019-10-24 08:45:02.0 | 2019-10-24 19:25:01.0 |
| samadhanma@gmail.com | 2019-10-24 01:10:00.0 | 2019-10-24 08:39:59.0 | 2019-10-24 09:00:02.0 | 2019-10-24 18:50:01.0 |
+------------------------+-------------------------+----------------------------+-------------------------+------------------------+--+
The query with the subquery:
WITH t AS(
SELECT ROUND(AVG(unix_timestamp(working_hours)),2) as average
FROM workinglogs)
SELECT w.user_name,w.idle_hours,w.working_hours,w.start_time,w.stop_time
FROM workinglogs AS w,t
WHERE unix_timestamp(w.working_hours) < t.average;
Output:
+------------------------+------------------------+------------------------+------------------------+------------------------+--+
| w.user_name | w.idle_hours | w.working_hours | w.start_time | w.stop_time |
+------------------------+------------------------+------------------------+------------------------+------------------------+--+
| magadum@gmail.com | 2019-10-24 02:15:00.0 | 2019-10-24 08:39:59.0 | 2019-10-24 08:30:02.0 | 2019-10-24 19:25:01.0 |
| puruissim@gmail.com | 2019-10-24 03:15:00.0 | 2019-10-24 07:19:59.0 | 2019-10-24 08:50:02.0 | 2019-10-24 19:25:01.0 |
| sangita.awa@gmail.com | 2019-10-24 01:55:00.0 | 2019-10-24 08:40:00.0 | 2019-10-24 08:50:01.0 | 2019-10-24 19:25:01.0 |
| you@example.com | 2019-10-24 02:35:00.0 | 2019-10-24 08:04:59.0 | 2019-10-24 08:45:02.0 | 2019-10-24 19:25:01.0 |
| samadhanma@gmail.com | 2019-10-24 01:10:00.0 | 2019-10-24 08:39:59.0 | 2019-10-24 09:00:02.0 | 2019-10-24 18:50:01.0 |
+------------------------+------------------------+------------------------+------------------------+------------------------+--+
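The same logic can be sanity-checked outside Hive. Here is a minimal Python sketch (sample data hard-coded from the table above) that mirrors what unix_timestamp() + AVG() do: convert each timestamp to epoch seconds, average them, and keep the rows below the average.

```python
from datetime import datetime

# Sample (user_name, working_hours) pairs copied from the table above
rows = [
    ("magadum@gmail.com",     "2019-10-24 08:39:59"),
    ("yathink3@gmail.com",    "2019-10-24 09:24:59"),
    ("shelkeva@gmail.com",    "2019-10-24 09:10:01"),
    ("puruissim@gmail.com",   "2019-10-24 07:19:59"),
    ("sangita.awa@gmail.com", "2019-10-24 08:40:00"),
    ("vaishusawan@gmail.com", "2019-10-24 09:55:00"),
    ("you@example.com",       "2019-10-24 08:04:59"),
    ("samadhanma@gmail.com",  "2019-10-24 08:39:59"),
]

def epoch(ts: str) -> float:
    # Same idea as Hive's unix_timestamp(): timestamp string -> seconds
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").timestamp()

# AVG(unix_timestamp(working_hours)) over the whole table
average = sum(epoch(ts) for _, ts in rows) / len(rows)

# WHERE unix_timestamp(working_hours) < average
below = [user for user, ts in rows if epoch(ts) < average]
print(below)
```

Running this prints the same five users the Hive query returns, which confirms the filter behaves as intended.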