How to do a query on Oracle SQL to get time intervals, grouping by specific fields
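I like a good challenge, but this one has been giving me a headache for too long. :)
I'm trying to build a query that returns date intervals, grouping the rows by one field.
Let me try to explain it in a simple way.
We have this table:
I need to get the time interval a soldier spent at each ranking, so the final result I need should look like this:
As you can see, a soldier can be promoted or demoted at any time.
Any suggestions on how to build a query that does this?
Thanks!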
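This is a type of gaps-and-islands problem. You want to find groups of adjacent rows that share the same ranking. You can use lag() to fetch the previous row's end_date within the same ranking, and then a cumulative sum to keep track of the changes: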
    select soldier_id, soldier_name, ranking,
           min(start_date) as start_date, max(end_date) as end_date
    from (select t.*,
                 -- running count of "island starts": 0 if this row continues the previous row of the same ranking
                 sum(case when prev_end_date = start_date - interval '1' day then 0 else 1 end)
                     over (partition by soldier_id order by start_date) as island
          from (select t.*,
                       -- end_date of the previous row with the same soldier and ranking
                       lag(end_date) over (partition by soldier_id, ranking order by start_date) as prev_end_date
                from t
               ) t
         ) t
    group by soldier_id, soldier_name, ranking, island;
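Note: This assumes that soldier_name does not change over time for a given soldier. If that is something you need to handle, ask a new question with appropriate sample data and desired results.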
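To make the mechanics of the lag() plus cumulative-sum approach easier to follow, here is a minimal Python sketch of the same logic outside the database. It is only an illustration under assumptions: the rows are an abbreviated, hypothetical subset of the kind of data in the question, and `merge_islands` is a made-up helper name, not part of any library.

```python
from datetime import date, timedelta

# Hypothetical sample rows: (soldier_id, ranking, start_date, end_date).
rows = [
    (1001, "Lieutenant", date(2000, 3, 20), date(2002, 8, 15)),
    (1001, "Lieutenant", date(2002, 8, 16), date(2004, 10, 1)),
    (1001, "Captain",    date(2004, 10, 2), date(2007, 2, 20)),
    (1001, "Major",      date(2007, 2, 21), date(2010, 1, 26)),
    (1001, "Captain",    date(2010, 1, 27), date(2014, 5, 11)),
]


def merge_islands(rows):
    """Collapse contiguous rows of the same ranking into one interval each."""
    prev_end = {}  # (soldier_id, ranking) -> end_date of the previous row; plays the role of lag()
    island = {}    # soldier_id -> running sum of the "new island" flags
    merged = {}    # (soldier_id, ranking, island) -> [min start_date, max end_date]
    # Process rows per soldier in start_date order, like the window's ORDER BY.
    for soldier_id, ranking, start, end in sorted(rows, key=lambda r: (r[0], r[2])):
        # The flag is 0 when this row starts the day after the previous row
        # of the same ranking ended, and 1 otherwise (a new island begins).
        continues = prev_end.get((soldier_id, ranking)) == start - timedelta(days=1)
        island[soldier_id] = island.get(soldier_id, 0) + (0 if continues else 1)
        prev_end[(soldier_id, ranking)] = end
        key = (soldier_id, ranking, island[soldier_id])
        if key in merged:
            merged[key][0] = min(merged[key][0], start)
            merged[key][1] = max(merged[key][1], end)
        else:
            merged[key] = [start, end]
    return [(s, r, lo, hi) for (s, r, _), (lo, hi) in merged.items()]


result = merge_islands(rows)
for row in result:
    print(row)
```

The second Captain stint gets its own island number because its start date does not directly follow the end date of the earlier Captain row, which is why the two Captain periods are reported separately.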
From Oracle 12, you can use MATCH_RECOGNIZE:
    SELECT *
    FROM   table_name
    MATCH_RECOGNIZE (
      PARTITION BY id
      ORDER BY start_date, end_date
      MEASURES
        FIRST( name ) AS name,
        FIRST( ranking ) AS ranking,
        FIRST( start_date ) AS start_date,
        LAST( end_date ) AS end_date
      PATTERN ( same_rank+ )
      DEFINE same_rank AS FIRST( ranking ) = ranking
    )
Which, for the sample data:
    CREATE TABLE table_name ( id, name, ranking, start_date, end_date ) AS
    SELECT 1001, 'Jones', 'Lieutenant', DATE '2000-03-20', DATE '2002-08-15' FROM DUAL UNION ALL
    SELECT 1001, 'Jones', 'Lieutenant', DATE '2002-08-16', DATE '2003-03-18' FROM DUAL UNION ALL
    SELECT 1001, 'Jones', 'Lieutenant', DATE '2003-03-19', DATE '2004-06-01' FROM DUAL UNION ALL
    SELECT 1001, 'Jones', 'Lieutenant', DATE '2004-06-02', DATE '2004-10-01' FROM DUAL UNION ALL
    SELECT 1001, 'Jones', 'Captain', DATE '2004-10-02', DATE '2005-04-20' FROM DUAL UNION ALL
    SELECT 1001, 'Jones', 'Captain', DATE '2005-04-21', DATE '2007-02-20' FROM DUAL UNION ALL
    SELECT 1001, 'Jones', 'Major', DATE '2007-02-21', DATE '2008-10-22' FROM DUAL UNION ALL
    SELECT 1001, 'Jones', 'Major', DATE '2008-10-23', DATE '2010-01-26' FROM DUAL UNION ALL
    SELECT 1001, 'Jones', 'Captain', DATE '2010-01-27', DATE '2013-11-25' FROM DUAL UNION ALL
    SELECT 1001, 'Jones', 'Captain', DATE '2013-11-26', DATE '2014-05-11' FROM DUAL UNION ALL
    SELECT 1001, 'Jones', 'Major', DATE '2014-05-12', DATE '2016-04-22' FROM DUAL UNION ALL
    SELECT 1001, 'Jones', 'General', DATE '2016-04-23', DATE '2020-10-10' FROM DUAL UNION ALL
    SELECT 1001, 'Jones', 'General', DATE '2020-10-11', DATE '2020-11-30' FROM DUAL;
Outputs:
ID | NAME | RANKING | START_DATE | END_DATE
---: | :---- | :--------- | :------------------ | :------------------
1001 | Jones | Lieutenant | 2000-03-20 00:00:00 | 2004-10-01 00:00:00
1001 | Jones | Captain | 2004-10-02 00:00:00 | 2007-02-20 00:00:00
1001 | Jones | Major | 2007-02-21 00:00:00 | 2010-01-26 00:00:00
1001 | Jones | Captain | 2010-01-27 00:00:00 | 2014-05-11 00:00:00
1001 | Jones | Major | 2014-05-12 00:00:00 | 2016-04-22 00:00:00
1001 | Jones | General | 2016-04-23 00:00:00 | 2020-11-30 00:00:00
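
As a rough way to see what the `same_rank+` pattern is doing, the following Python sketch groups consecutive rows that share a ranking and keeps the first start date and last end date of each run. It is a sketch under assumptions rather than a description of how Oracle evaluates MATCH_RECOGNIZE: `collapse_runs` is a hypothetical helper name, and the rows are the sample data from the CREATE TABLE statement above.

```python
from datetime import date
from itertools import groupby

# Sample rows from the answer: (id, name, ranking, start_date, end_date).
rows = [
    (1001, "Jones", "Lieutenant", date(2000, 3, 20), date(2002, 8, 15)),
    (1001, "Jones", "Lieutenant", date(2002, 8, 16), date(2003, 3, 18)),
    (1001, "Jones", "Lieutenant", date(2003, 3, 19), date(2004, 6, 1)),
    (1001, "Jones", "Lieutenant", date(2004, 6, 2), date(2004, 10, 1)),
    (1001, "Jones", "Captain", date(2004, 10, 2), date(2005, 4, 20)),
    (1001, "Jones", "Captain", date(2005, 4, 21), date(2007, 2, 20)),
    (1001, "Jones", "Major", date(2007, 2, 21), date(2008, 10, 22)),
    (1001, "Jones", "Major", date(2008, 10, 23), date(2010, 1, 26)),
    (1001, "Jones", "Captain", date(2010, 1, 27), date(2013, 11, 25)),
    (1001, "Jones", "Captain", date(2013, 11, 26), date(2014, 5, 11)),
    (1001, "Jones", "Major", date(2014, 5, 12), date(2016, 4, 22)),
    (1001, "Jones", "General", date(2016, 4, 23), date(2020, 10, 10)),
    (1001, "Jones", "General", date(2020, 10, 11), date(2020, 11, 30)),
]


def collapse_runs(rows):
    """Merge each run of consecutive rows with the same id and ranking."""
    # Mirrors PARTITION BY id ORDER BY start_date, end_date.
    ordered = sorted(rows, key=lambda r: (r[0], r[3], r[4]))
    out = []
    # groupby starts a new group whenever the (id, ranking) key changes,
    # so a ranking that recurs later forms a separate run.
    for (id_, ranking), run in groupby(ordered, key=lambda r: (r[0], r[2])):
        run = list(run)
        first, last = run[0], run[-1]
        # FIRST(name), FIRST(ranking), FIRST(start_date), LAST(end_date)
        out.append((id_, first[1], ranking, first[3], last[4]))
    return out


result = collapse_runs(rows)
for row in result:
    print(row)
```

Like the pattern as written, this groups purely on consecutive rows sharing a ranking. It does not check that the dates are contiguous, which the lag()-based query does, so the two approaches can differ if a soldier's history has gaps within one ranking.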