How do you identify date ranges/intervals for specified values?
I have a set of data that tells me the owner for each date; sample data is below. There are some breaks in the date column.
| owner | date |
|-------------+-------------+
| Samantha | 2010-01-02 |
| Max | 2010-01-03 |
| Max | 2010-01-04 |
| Max | 2010-01-06 |
| Max | 2010-01-07 |
| Conor | 2010-01-08 |
| Conor | 2010-01-09 |
| Conor | 2010-01-10 |
| Conor | 2010-01-11 |
| Abigail | 2010-01-12 |
| Abigail | 2010-01-13 |
| Abigail | 2010-01-14 |
| Abigail | 2010-01-15 |
| Max | 2010-01-17 |
| Max | 2010-01-18 |
| Abigail | 2010-01-20 |
| Conor | 2010-01-21 |
I'm trying to write a query that captures the date range of each owner's intervals, for example:
| owner | start | end |
|-------------+------------+------------+
| Samantha | 2010-01-02 | 2010-01-02 |
| Max | 2010-01-03 | 2010-01-04 |
| Max | 2010-01-06 | 2010-01-07 |
| Conor | 2010-01-08 | 2010-01-11 |
| Abigail | 2010-01-12 | 2010-01-15 |
| Max | 2010-01-17 | 2010-01-18 |
| Abigail | 2010-01-20 | 2010-01-20 |
| Conor | 2010-01-21 | 2010-01-21 |
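I tried to work this out with min() and max(), but I got stuck. I feel like I need to use lead() and lag(), but I'm not sure how to use them to get the output I want. Any ideas? Thanks in advance!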
This is a typical gaps-and-islands problem. Here is one way to solve it using row_number():
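-- "end" is a reserved word in many databases, so the aliases are renamed
select owner, min(date) as start_date, max(date) as end_date
from (
    select
        owner,
        date,  -- must be selected here so the outer query can aggregate it
        row_number() over(order by date) rn1,
        row_number() over(partition by owner order by date) rn2
    from mytable
) t
group by owner, rn1 - rn2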
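This works by ranking the records by date over two different partitions (across the whole table, and within groups that share the same owner). The difference between the two ranks gives you the group each record belongs to. You can run the inner query and look at the results to understand the logic. Note that the difference only changes when another owner's rows come in between, so a gap in the dates alone does not start a new group: with the sample data, Max's rows from 2010-01-03 to 2010-01-07 end up in a single group even though 2010-01-05 is missing.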
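To see the two ranks side by side, here is a minimal Python sketch that loads the sample rows into an in-memory SQLite database and runs an inner query of this shape. It is only an illustration, not part of the answer: `ROWS`, `INNER_QUERY` and `rank_rows` are names made up for this example, the table name `mytable` is taken from the query, and it assumes a SQLite build with window function support (3.25 or later).

```python
import sqlite3

# Sample rows from the question, as (owner, date) pairs.
ROWS = [
    ("Samantha", "2010-01-02"),
    ("Max", "2010-01-03"), ("Max", "2010-01-04"),
    ("Max", "2010-01-06"), ("Max", "2010-01-07"),
    ("Conor", "2010-01-08"), ("Conor", "2010-01-09"),
    ("Conor", "2010-01-10"), ("Conor", "2010-01-11"),
    ("Abigail", "2010-01-12"), ("Abigail", "2010-01-13"),
    ("Abigail", "2010-01-14"), ("Abigail", "2010-01-15"),
    ("Max", "2010-01-17"), ("Max", "2010-01-18"),
    ("Abigail", "2010-01-20"),
    ("Conor", "2010-01-21"),
]

# rn1 ranks every row by date; rn2 ranks rows by date within each owner.
INNER_QUERY = """
select owner, date,
       row_number() over (order by date) as rn1,
       row_number() over (partition by owner order by date) as rn2
from mytable
order by date
"""


def rank_rows(rows):
    """Return (owner, date, rn1, rn2, rn1 - rn2) for each row, ordered by date."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table mytable (owner text, date text)")
    conn.executemany("insert into mytable values (?, ?)", rows)
    ranked = [(owner, day, rn1, rn2, rn1 - rn2)
              for owner, day, rn1, rn2 in conn.execute(INNER_QUERY)]
    conn.close()
    return ranked


if __name__ == "__main__":
    for row in rank_rows(ROWS):
        print(row)
```

In the printed rows, the last column stays the same for as long as one owner's rows are uninterrupted by another owner, which is the value the outer query groups on together with owner.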
This is a gaps-and-islands problem. You want to solve it by subtracting a sequential value from the date and aggregating:
select owner, min(date), max(date)
from (select t.*,
             row_number() over (partition by owner order by date) as seqnum
      from t
     ) t
group by owner, (date - seqnum * interval '1 day')
order by min(date);
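The magic is that, while an owner's dates increase by one day at a time, the date minus the sequence number stays constant. As soon as a day is skipped, that difference jumps, which starts a new group.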
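The same idea can be traced outside the database. The following is a rough Python sketch, not the answer's own code, that mimics row_number() with a per-owner counter and groups rows by the date minus that counter; `ROWS`, `find_islands` and `anchor` are names invented here for illustration.

```python
from collections import defaultdict
from datetime import date, timedelta

# Sample rows from the question, as (owner, ISO date string) pairs.
ROWS = [
    ("Samantha", "2010-01-02"),
    ("Max", "2010-01-03"), ("Max", "2010-01-04"),
    ("Max", "2010-01-06"), ("Max", "2010-01-07"),
    ("Conor", "2010-01-08"), ("Conor", "2010-01-09"),
    ("Conor", "2010-01-10"), ("Conor", "2010-01-11"),
    ("Abigail", "2010-01-12"), ("Abigail", "2010-01-13"),
    ("Abigail", "2010-01-14"), ("Abigail", "2010-01-15"),
    ("Max", "2010-01-17"), ("Max", "2010-01-18"),
    ("Abigail", "2010-01-20"),
    ("Conor", "2010-01-21"),
]


def find_islands(rows):
    """Return (owner, start, end) for each run of consecutive dates per owner."""
    parsed = sorted(((owner, date.fromisoformat(day)) for owner, day in rows),
                    key=lambda r: r[1])
    seqnum = defaultdict(int)  # per-owner counter, plays the role of row_number()
    islands = {}               # (owner, anchor) -> [start, end]
    for owner, day in parsed:
        seqnum[owner] += 1
        # The anchor is constant while this owner's dates advance one day at a time.
        anchor = day - timedelta(days=seqnum[owner])
        span = islands.setdefault((owner, anchor), [day, day])
        span[1] = day  # rows arrive in date order, so this is the latest date so far
    result = [(owner, start, end) for (owner, _), (start, end) in islands.items()]
    result.sort(key=lambda r: r[1])
    return result


if __name__ == "__main__":
    for owner, start, end in find_islands(ROWS):
        print(owner, start, end)
```

For Max, for instance, the anchors come out as 2010-01-02 for the first two rows and 2010-01-03 for the next two, so the missing 2010-01-05 splits them into separate ranges.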