Get distinct consecutive date ranges from overlapping date ranges
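I need to turn a list of overlapping date ranges into a list of date ranges that do not overlap each other, and get the sum of coins for each resulting period. I tried googling for an example but have had no luck so far. Maybe I am not using the right keywords?
I have a list of overlapping dates: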
1.1.2018 - 31.1.2018 80
7.1.2018 - 10.1.2018 10
7.1.2018 - 31.1.2018 10
11.1.2018 - 31.1.2018 5
25.1.2018 - 27.1.2018 5
2.2.2018 - 23.2.2018 100
The desired result is:
1.1.2018 - 6.1.2018 80 coins
7.1.2018 - 10.1.2018 100 coins
11.1.2018 - 24.1.2018 95 coins
25.1.2018 - 27.1.2018 100 coins
28.1.2018 - 31.1.2018 95 coins
2.2.2018 - 23.2.2018 100 coins
Here is a diagram of how it should work:
|------------------------------|
       |---|
       |-----------------------|
           |-------------------|
                      |---|
                                 |----------------------|
Outcome
|------|---|----------|---|----| |----------------------|
   80   100     95     100 95              100
Here is my test data:
-- "if exists" keeps the script runnable on a fresh database
drop table if exists coinsonperiod2;
create table coinsonperiod2(
    id serial,
    startdate date,
    enddate date,
    coins integer,
    userid integer
);
insert into coinsonperiod2 (startdate, enddate, coins, userid) values
  ('2018-01-01','2018-01-31', 80,1)
, ('2018-01-07','2018-01-10', 10,1)
, ('2018-01-07','2018-01-31', 10,1)
, ('2018-01-11','2018-01-31', 5,1)
, ('2018-01-25','2018-01-27', 5,1)
, ('2018-02-02','2018-02-23', 100,2)
, ('2018-01-01','2018-01-31', 80,2)
, ('2018-01-07','2018-01-10', 10,2)
, ('2018-01-07','2018-01-31', 10,2)
, ('2018-01-11','2018-01-31', 5,2)
, ('2018-01-25','2018-01-27', 5,2)
, ('2018-02-02','2018-02-23', 100,3)
;
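Update:
Actually, neither StephenM's nor joops's answer gives the result I want. Both answers show the wrong end dates.
When one period ends, the next one should start on the following day (or later, if there is a gap). In my desired result, 1.1.2018 - 6.1.2018 includes the 6th. There is no gap between the 6th and the 7th, because the 7th is included in 7.1.2018 - 10.1.2018.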
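Update 2:
Now I understand the difference between open, half-open and closed intervals. In joops's solution the calculation has to be done on half-open intervals, but the result I want uses closed intervals. That is why one day has to be subtracted from enddate at the end, to turn the result back into closed intervals. Please correct me if I am wrong.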
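To make the interval distinction concrete, here is a minimal Python sketch (not part of the original query). It assumes day-granular dates; the helper names to_half_open and to_closed are made up for illustration. It shows why adding one day to enddate makes adjacent periods line up, and why subtracting one day restores the closed form.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)

def to_half_open(start, end):
    """Closed [start, end] -> half-open [start, end + 1 day)."""
    return start, end + ONE_DAY

def to_closed(start, end_exclusive):
    """Half-open [start, end_exclusive) -> closed [start, end_exclusive - 1 day]."""
    return start, end_exclusive - ONE_DAY

# Two closed periods from the desired result; the 6th and 7th are consecutive days.
first = (date(2018, 1, 1), date(2018, 1, 6))
second = (date(2018, 1, 7), date(2018, 1, 10))

# In half-open form, "no gap" simply means: end of one period == start of the next.
print(to_half_open(*first)[1] == second[0])   # True
# Converting back gives the original closed period.
print(to_closed(*to_half_open(*first)) == first)  # True
```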
I also added userid to the sample data and modified joops's solution accordingly.
Here is the query that gives me the result I want:
WITH changes AS (
    SELECT
        userid,
        startdate AS tickdate,
        coins,
        1 AS cover
    FROM coinsonperiod2
    UNION ALL
    -- add 1 day to correct intervals into half open intervals, so the calculation is correct
    SELECT
        userid,
        1 + enddate AS tickdate,
        -1 * coins,
        -1 AS cover
    FROM coinsonperiod2
)
, sumchanges AS (
    SELECT
        userid,
        tickdate,
        SUM(coins) AS change,
        SUM(cover) AS cover
    FROM changes
    GROUP BY tickdate, userid
)
, aggregated AS (
    SELECT
        userid AS userid,
        tickdate AS startdate,
        lead(tickdate) OVER www AS enddate,
        sum(change) OVER www AS cash,
        sum(cover) OVER www AS cover
    FROM sumchanges
    WINDOW www AS (
        PARTITION BY userid
        ORDER BY tickdate )
)
-- reduce 1 day from the enddate to make closed interval
SELECT
    userid
    , startdate
    , enddate-1 AS enddate
    , cash
    , cover
FROM aggregated
WHERE cover > 0
ORDER BY userid, startdate
;
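For readers who want to trace the per-user logic outside the database, here is a rough Python sketch of the same steps. It assumes the sample rows from coinsonperiod2 as (startdate, enddate, coins, userid) tuples; the function name periods_per_user is made up for illustration. It mirrors the query rather than replacing it: events are keyed by (userid, tickdate) like the GROUP BY, and the running sums restart for each user like PARTITION BY userid.

```python
from collections import defaultdict
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)

JANUARY = [
    (date(2018, 1, 1), date(2018, 1, 31), 80),
    (date(2018, 1, 7), date(2018, 1, 10), 10),
    (date(2018, 1, 7), date(2018, 1, 31), 10),
    (date(2018, 1, 11), date(2018, 1, 31), 5),
    (date(2018, 1, 25), date(2018, 1, 27), 5),
]
FEBRUARY = (date(2018, 2, 2), date(2018, 2, 23), 100)

# Same rows as the insert statement: user 1 has January only,
# user 2 has January and February, user 3 has February only.
ROWS = ([(s, e, c, 1) for s, e, c in JANUARY]
        + [(s, e, c, 2) for s, e, c in JANUARY + [FEBRUARY]]
        + [(*FEBRUARY, 3)])

def periods_per_user(rows):
    # (userid, tickdate) -> [coin change, cover change], like GROUP BY tickdate, userid
    events = defaultdict(lambda: [0, 0])
    for start, end, coins, userid in rows:
        events[(userid, start)][0] += coins
        events[(userid, start)][1] += 1
        events[(userid, end + ONE_DAY)][0] -= coins   # half-open end
        events[(userid, end + ONE_DAY)][1] -= 1

    keys = sorted(events)
    running = {}  # userid -> [cash, cover]; separate running sums per user
    result = []
    for i, (userid, tick) in enumerate(keys):
        state = running.setdefault(userid, [0, 0])
        state[0] += events[(userid, tick)][0]
        state[1] += events[(userid, tick)][1]
        # lead(tickdate) within the same user partition
        has_next = i + 1 < len(keys) and keys[i + 1][0] == userid
        if has_next and state[1] > 0:
            closed_end = keys[i + 1][1] - ONE_DAY  # back to a closed interval
            result.append((userid, tick, closed_end, state[0], state[1]))
    return result

for row in periods_per_user(ROWS):
    print(row)
```

With the sample rows, user 3 ends up with a single period, 2018-02-02 to 2018-02-23 with 100 coins, and user 1 with the five January periods from the desired result.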
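OK, I will help you with the logic; the syntax you can look up online.
What you can do is create a temporary table and move the data there, then select each row and store the value of each column in declared variables.
Then simply use a cursor to select all the data from your source table, compare the dates with the ordinary greater-than and less-than operators, and calculate the sums as you go.
In short: take column 1 of row 1 and compare it with the column 1 and column 2 values of all the other rows.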
It seems I found an ugly piece that works:
select t1.dt, t1.enddt, sum(coins)
from (
select distinct cp1.dt, min(cp2.dt) enddt
from ( select startdate as dt from coinsonperiod union all select enddate as dt from coinsonperiod ) cp1,
( select startdate as dt from coinsonperiod union all select enddate as dt from coinsonperiod ) cp2
where cp2.dt > cp1.dt
group by cp1.dt
order by cp1.dt ) t1, coinsonperiod t2
where t1.dt between t2.startdate and t2.enddate
and t1.enddt between t2.startdate and t2.enddate
group by t1.dt, t1.enddt
Output:
dt |enddt |sum |
-----------|-----------|----|
2018-01-01 |2018-01-07 |80 |
2018-01-07 |2018-01-10 |100 |
2018-01-10 |2018-01-11 |90 |
2018-01-11 |2018-01-25 |95 |
2018-01-25 |2018-01-27 |100 |
2018-01-27 |2018-01-31 |95 |
2018-02-02 |2018-02-23 |100 |
The only difference from your output is that, I think, you forgot the interval between 01/10 and 01/11.
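One way to check the 90 row is a brute-force sketch that sums the coins day by day. This is only an illustration in Python, assuming the six sample periods and closed intervals (both endpoints included); the helper coins_on is made up for this example.

```python
from datetime import date, timedelta

PERIODS = [
    (date(2018, 1, 1), date(2018, 1, 31), 80),
    (date(2018, 1, 7), date(2018, 1, 10), 10),
    (date(2018, 1, 7), date(2018, 1, 31), 10),
    (date(2018, 1, 11), date(2018, 1, 31), 5),
    (date(2018, 1, 25), date(2018, 1, 27), 5),
    (date(2018, 2, 2), date(2018, 2, 23), 100),
]

def coins_on(day):
    """Sum of coins over all periods that include the given day."""
    return sum(coins for start, end, coins in PERIODS if start <= day <= end)

# Walk every day from the first start date to the last end date.
daily = {}
day = date(2018, 1, 1)
while day <= date(2018, 2, 23):
    daily[day] = coins_on(day)
    day += timedelta(days=1)

print(daily[date(2018, 1, 10)], daily[date(2018, 1, 11)])  # 100 95
print(90 in daily.values())  # False
```

Under these assumptions no single day sums to 90. The 90 in the output above appears to come from the two between conditions, which count only the periods that contain both 01/10 and 01/11 (80 + 10).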
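The logic is:
- at the start of an interval, add its value to the cumulative sum
- at the end of an interval, subtract its value from this sum
- but in order to sweep along the date line, we have to collect all the (unique) date/time stamps, whether they are starts or stops.
So the point is: convert the data from a series of intervals into a series of (start/stop) events, and aggregate over these.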
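The steps above can be sketched in a few lines of Python before looking at the SQL. This is only an illustration of the event-sweep idea, assuming closed input intervals with day granularity; the names sweep and PERIODS are made up for this example.

```python
from collections import defaultdict
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)

PERIODS = [
    (date(2018, 1, 1), date(2018, 1, 31), 80),
    (date(2018, 1, 7), date(2018, 1, 10), 10),
    (date(2018, 1, 7), date(2018, 1, 31), 10),
    (date(2018, 1, 11), date(2018, 1, 31), 5),
    (date(2018, 1, 25), date(2018, 1, 27), 5),
    (date(2018, 2, 2), date(2018, 2, 23), 100),
]

def sweep(periods):
    """Turn closed (start, end, coins) intervals into non-overlapping closed periods."""
    change = defaultdict(int)  # tick date -> net change in coins
    cover = defaultdict(int)   # tick date -> net change in number of active intervals
    for start, end, coins in periods:
        change[start] += coins            # start event
        cover[start] += 1
        change[end + ONE_DAY] -= coins    # stop event, one day later (half-open)
        cover[end + ONE_DAY] -= 1

    ticks = sorted(change)  # all unique event dates, in order
    result = []
    cash = active = 0
    for tick, next_tick in zip(ticks, ticks[1:]):
        cash += change[tick]
        active += cover[tick]
        if active > 0:  # skip gaps where no interval is active
            result.append((tick, next_tick - ONE_DAY, cash))
    return result

for start, end, cash in sweep(PERIODS):
    print(start, end, cash)
```

On the sample data this yields six periods, starting with 2018-01-01 to 2018-01-06 with 80 coins and ending with 2018-02-02 to 2018-02-23 with 100 coins. The day 2018-02-01 is dropped because no interval covers it.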
-- \i tmp.sql
create table coinsonperiod(
id serial,
startdate date,
enddate date,
coins integer
);
insert into coinsonperiod (startdate, enddate, coins) values
('2018-01-01','2018-01-31', 80)
, ('2018-01-07','2018-01-10', 10)
, ('2018-01-07','2018-01-31', 10)
, ('2018-01-11','2018-01-31', 5)
, ('2018-01-25','2018-01-27', 5)
, ('2018-02-02','2018-02-23', 100)
;
WITH changes AS (
    SELECT startdate AS tickdate, coins
        , 1 AS cover
    FROM coinsonperiod
    UNION ALL
    -- add 1 day to convert to half-open intervals
    SELECT 1+enddate AS tickdate, -1* coins
        , -1 AS cover
    FROM coinsonperiod
)
, sumchanges AS (
    SELECT tickdate, SUM(coins) AS change, SUM(cover) AS cover
    FROM changes
    GROUP BY tickdate
)
, aggregated AS (
    SELECT
        tickdate AS startdate
        , lead(tickdate) OVER www AS enddate
        , sum(change) OVER www AS cash
        -- number of covered intervals
        , sum(cover) OVER www AS cover
    FROM sumchanges
    WINDOW www AS (ORDER BY tickdate)
)
-- subtract one day from enddate to correct back to closed intervals
SELECT startdate, enddate-1 AS enddate, cash, cover
FROM aggregated
WHERE cover > 0
ORDER BY startdate
;