Impala: select field with criteria when using group by
I have the following table:
id | animal | timestamp | team
---------------------------------------
1 | dog | 2016-08-01 | blue
2 | cat | 2016-08-02 | blue
3 | bird | 2016-07-05 | red
4 | cow | 2016-08-04 | red
5 | snake | 2016-08-12 | yellow
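I want to find one animal for each team, with one condition: if a team has more than one animal, pick the animal with the later timestamp. Is this possible? Thanks!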
The typical approach uses row_number():
select t.*
from (select t.*,
             row_number() over (partition by team order by `timestamp` desc) as seqnum
      from t
     ) t
where seqnum = 1;
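To check the row_number() approach without an Impala cluster, here is a minimal sketch that runs the same pattern against an in-memory SQLite database from Python. It is a stand-in, not Impala itself: the column is renamed `ts` to avoid quoting, the function name `latest_per_team` is made up for illustration, and it assumes a SQLite build with window-function support (3.25 or newer).

```python
import sqlite3

# Sample rows copied from the question: (id, animal, timestamp, team).
ROWS = [
    (1, "dog", "2016-08-01", "blue"),
    (2, "cat", "2016-08-02", "blue"),
    (3, "bird", "2016-07-05", "red"),
    (4, "cow", "2016-08-04", "red"),
    (5, "snake", "2016-08-12", "yellow"),
]


def latest_per_team(rows):
    """Return one row per team: the row with the latest timestamp."""
    conn = sqlite3.connect(":memory:")
    # Assumption: the timestamp column is called ts and stored as ISO text,
    # which sorts chronologically when compared as a string.
    conn.execute(
        "create table teams (id integer, animal text, ts text, team text)"
    )
    conn.executemany("insert into teams values (?, ?, ?, ?)", rows)
    query = """
        select id, animal, ts, team
        from (select t.*,
                     row_number() over (partition by team
                                        order by ts desc) as seqnum
              from teams t) ranked
        where seqnum = 1
        order by team
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in latest_per_team(ROWS):
        print(row)
```

Because row_number() assigns exactly one row the value 1 in each partition, this returns a single animal per team even when two rows in a team share the same timestamp.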
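For example, you can use the following query. Note that it uses min(), so it returns the animal with the earliest timestamp in each team; to get the later one, as the question asks, replace min with max: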
select * from teams t1 where `timestamp`=(select min(t2.`timestamp`) from teams t2 where t2.team = t1.team);
In practice:
[localhost:21000] > create table teams(id int, animal string, `timestamp` timestamp, team string);
[localhost:21000] > insert into teams values (1, "dog", "2016-08-01", "blue"), (2, "cat", "2016-08-02", "blue"), (3, "bird", "2016-07-05", "red"), (4, "cow", "2016-08-04", "red"), (5, "snake", "2016-08-12", "yellow");
[localhost:21000] > select * from teams t1 where `timestamp`=(select min(t2.`timestamp`) from teams t2 where t2.team = t1.team);
+----+--------+---------------------+--------+
| id | animal | timestamp | team |
+----+--------+---------------------+--------+
| 1 | dog | 2016-08-01 00:00:00 | blue |
| 3 | bird | 2016-07-05 00:00:00 | red |
| 5 | snake | 2016-08-12 00:00:00 | yellow |
+----+--------+---------------------+--------+
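The same correlated-subquery pattern with max() in place of min() selects each team's latest row. The sketch below runs it against an in-memory SQLite database from Python rather than Impala; the column name `ts` and the function name `latest_by_max` are assumptions made for illustration.

```python
import sqlite3

# Sample rows copied from the question: (id, animal, timestamp, team).
ROWS = [
    (1, "dog", "2016-08-01", "blue"),
    (2, "cat", "2016-08-02", "blue"),
    (3, "bird", "2016-07-05", "red"),
    (4, "cow", "2016-08-04", "red"),
    (5, "snake", "2016-08-12", "yellow"),
]


def latest_by_max(rows):
    """Return every row whose timestamp equals its team's maximum."""
    conn = sqlite3.connect(":memory:")
    # Assumption: timestamps are ISO-formatted text, so max() on the
    # string picks the chronologically latest value.
    conn.execute(
        "create table teams (id integer, animal text, ts text, team text)"
    )
    conn.executemany("insert into teams values (?, ?, ?, ?)", rows)
    query = """
        select t1.id, t1.animal, t1.ts, t1.team
        from teams t1
        where t1.ts = (select max(t2.ts)
                       from teams t2
                       where t2.team = t1.team)
        order by t1.team, t1.id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in latest_by_max(ROWS):
        print(row)
```

One design point to keep in mind: if two animals in a team share the latest timestamp, this form returns both of them, so it only guarantees one row per team when timestamps are unique within a team.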