Impala/SQL: Can I have a different time period for each group?
I have the following table:
id | timestamp | team
----------------------------
1 | 2016-05-06 | A
2 | 2016-03-02 | A
3 | 2015-12-01 | A
4 | 2016-07-05 | B
5 | 2016-06-30 | B
6 | 2016-06-28 | B
7 | 2016-04-05 | C
8 | 2016-04-02 | C
9 | 2016-01-02 | C
I want to group by team and find each team's latest timestamp, so I did:
select team, max(timestamp) from my_table group by team
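So far so good. However, I now want to know how many distinct ids each team has in its last month, meaning the month that ends at that team's latest timestamp. For example, for team A that period runs from 2016-04-07 to 2016-05-06, so the count is 1. For team B the last month runs from 2016-06-06 to 2016-07-05, so the count is 3. For team C the last month runs from 2016-03-06 to 2016-04-05, so the count is 2. My expected output should look like this: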
team | max(timestamp) | count_in_last_month
------------------------------------------------
A | 2016-05-06 | 1
B | 2016-07-05 | 3
C | 2016-04-05 | 2
Can this be done with an Impala query? Thanks!
Join the original table with a subquery that gets each team's maximum timestamp.
SELECT t1.team, t2.month_end, COUNT(DISTINCT t1.id) AS count_in_last_month
FROM my_table AS t1
JOIN (SELECT team, MAX(timestamp) AS month_end
      FROM my_table
      GROUP BY team) AS t2
  ON t1.team = t2.team
 AND t1.timestamp BETWEEN DATE_SUB(t2.month_end, INTERVAL 1 MONTH) AND t2.month_end
GROUP BY t1.team, t2.month_end
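To check the join logic against the sample data without an Impala cluster, here is a minimal sketch that runs an equivalent query in SQLite from Python. Several things are assumptions for illustration only: the column is renamed to `ts`, dates are stored as ISO-8601 text, SQLite's `date(x, '-1 month')` stands in for Impala's `DATE_SUB(x, INTERVAL 1 MONTH)`, and the helper name `count_last_month` is made up.

```python
import sqlite3

# Sample rows from the question: (id, ts, team).
ROWS = [
    (1, "2016-05-06", "A"),
    (2, "2016-03-02", "A"),
    (3, "2015-12-01", "A"),
    (4, "2016-07-05", "B"),
    (5, "2016-06-30", "B"),
    (6, "2016-06-28", "B"),
    (7, "2016-04-05", "C"),
    (8, "2016-04-02", "C"),
    (9, "2016-01-02", "C"),
]

# Same shape as the answer's query. SQLite's date(x, '-1 month') is used
# here as a stand-in for Impala's DATE_SUB(x, INTERVAL 1 MONTH).
QUERY = """
SELECT t1.team, t2.month_end, COUNT(DISTINCT t1.id) AS count_in_last_month
FROM my_table AS t1
JOIN (SELECT team, MAX(ts) AS month_end
      FROM my_table
      GROUP BY team) AS t2
  ON t1.team = t2.team
 AND t1.ts BETWEEN date(t2.month_end, '-1 month') AND t2.month_end
GROUP BY t1.team, t2.month_end
ORDER BY t1.team
"""


def count_last_month(rows):
    """Load rows into an in-memory table and return the per-team counts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE my_table (id INTEGER, ts TEXT, team TEXT)")
    conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in count_last_month(ROWS):
        print(row)
```

On the sample data this yields counts of 1, 3 and 2 for teams A, B and C, matching the expected output. Note that BETWEEN is inclusive at both ends, so the window starts one day earlier than the ranges given in the question (2016-04-06 rather than 2016-04-07 for team A); that makes no difference for these rows, but the lower bound may need adjusting if the exact boundary matters.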