Get start of a time period from repeated groups
id - the id of the place where the action happened
t - the time of the action
+----+----------+
| id | t        |
+----+----------+
|  1 | 12:10:00 |
|  1 | 12:10:05 |
|  1 | 12:11:00 |
|  1 | 13:04:03 |
|  2 | 14:18:05 |
|  2 | 15:00:09 |
|  3 | 17:33:50 |
|  1 | 20:03:14 |
|  1 | 20:03:55 |
|  1 | 20:10:23 |
+----+----------+
The goal is to get this output:
+----+----------+
| id | start    |
+----+----------+
|  1 | 12:10:00 |
|  2 | 14:18:05 |
|  3 | 17:33:50 |
|  1 | 20:03:14 |
+----+----------+
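start - the time of the first action at the place id
Scripts that use rank, min, and similar functions lump all the id=1 rows into a single group, even though id=1 appears in two separate runs.
I don't know how to solve this and haven't found a similar post.
Here is the sqlfiddle with the script.
Thanks in advance!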
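This is a typical gaps-and-islands problem. You can use analytic functions such as ROW_NUMBER(), LAG(), and LEAD(). The main idea here is to apply ROW_NUMBER() twice, once over all rows and once with a PARTITION BY on id, and subtract one result from the other. The difference stays constant within each run of consecutive rows that share an id, so (id, difference) identifies a run. For example: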
SELECT DISTINCT tt.id, FIRST_VALUE(tt.t) OVER w AS start
  FROM (SELECT r.*,
               -- constant within each run of consecutive rows with the same id
               ROW_NUMBER() OVER (ORDER BY r.t)
             - ROW_NUMBER() OVER (PARTITION BY r.id ORDER BY r.t) AS rn
          FROM records r) tt
-- partition by id as well: the difference alone can coincide for different ids
WINDOW w AS (PARTITION BY tt.id, tt.rn ORDER BY tt.t
             ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
 ORDER BY start;
+----+----------+
| id | start    |
+----+----------+
|  1 | 12:10:00 |
|  2 | 14:18:05 |
|  3 | 17:33:50 |
|  1 | 20:03:14 |
+----+----------+
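To see why the row-number difference works, here is a minimal Python sketch of the same bookkeeping on the sample data from the question. It is only an illustration, not the SQL itself: the function name `island_starts` and the in-memory `records` list are made up for this example, and the times are kept as "HH:MM:SS" strings, which sort chronologically.

```python
from collections import defaultdict

# Sample rows from the question as (id, t) pairs.
records = [
    (1, "12:10:00"), (1, "12:10:05"), (1, "12:11:00"), (1, "13:04:03"),
    (2, "14:18:05"), (2, "15:00:09"),
    (3, "17:33:50"),
    (1, "20:03:14"), (1, "20:03:55"), (1, "20:10:23"),
]

def island_starts(rows):
    """Return one (id, start) pair per run of consecutive equal ids."""
    per_id_count = defaultdict(int)  # row number within each id
    starts = {}                      # (id, seq1 - seq2) -> first t seen
    ordered = sorted(rows, key=lambda r: r[1])
    for seq1, (row_id, t) in enumerate(ordered, start=1):
        per_id_count[row_id] += 1
        seq2 = per_id_count[row_id]
        # Rows arrive in time order, so the first t stored is the minimum.
        starts.setdefault((row_id, seq1 - seq2), t)
    pairs = [(row_id, t) for (row_id, _diff), t in starts.items()]
    return sorted(pairs, key=lambda p: p[1])

print(island_starts(records))
# [(1, '12:10:00'), (2, '14:18:05'), (3, '17:33:50'), (1, '20:03:14')]
```

On this data the differences are 0 for the first run of id=1, 4 for id=2, 6 for id=3, and 3 for the second run of id=1, which is what keeps the two id=1 runs apart.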
This kind of problem is known as a gaps-and-islands problem. It can be solved with the difference of two row numbers plus aggregation.
select id, min(t) as start, min(h) as start_hour
from
(
  select id
        ,t
        ,extract(hour from t) as h
        -- seq1 - seq2 stays constant within each run of consecutive equal ids
        ,row_number() over (order by t) as seq1
        ,row_number() over (partition by id order by t) as seq2
  from records
) t
group by id, (seq1 - seq2)
order by min(t);
Reference: db<>fiddle
The simplest way to solve this is to use lag():
select id, t as start
from (select r.*, lag(id) over (order by t) as prev_id
      from records r
     ) r
where prev_id is distinct from id;
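Basically, you just want the rows where the id changes. The first row has no previous id (prev_id is NULL), and "is distinct from" treats that as a change, so the first row is kept as well.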
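The same comparison can be sketched in Python to make the logic concrete. This is an illustration under assumptions rather than a translation of the query: `starts_by_lag` and the in-memory `records` list are hypothetical names, ids are assumed to be non-null, and `None` stands in for the NULL that lag() yields on the first row.

```python
# Sample rows from the question as (id, t) pairs.
records = [
    (1, "12:10:00"), (1, "12:10:05"), (1, "12:11:00"), (1, "13:04:03"),
    (2, "14:18:05"), (2, "15:00:09"),
    (3, "17:33:50"),
    (1, "20:03:14"), (1, "20:03:55"), (1, "20:10:23"),
]

def starts_by_lag(rows):
    """Keep each row whose id differs from the previous row's id in time order."""
    result = []
    prev_id = None  # no previous row yet, like lag(id) on the first row
    for row_id, t in sorted(rows, key=lambda r: r[1]):
        if row_id != prev_id:  # the id changed, so a new run starts here
            result.append((row_id, t))
        prev_id = row_id
    return result

print(starts_by_lag(records))
# [(1, '12:10:00'), (2, '14:18:05'), (3, '17:33:50'), (1, '20:03:14')]
```

A single pass with one remembered value is enough here, which is the reason this approach needs no grouping step.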
Note: I think treating this as a "typical" gaps-and-islands problem is overkill and complicates the solution.