How to fill in all periods between two dates in MySQL?
I have the following situation: a number of individuals, each with a start date and an end date:
ID  | start_date | end_date
1   | 2015-02-15 | 2015-04-20
2   | 2015-03-10 | 2015-06-15
... | ...        | ...
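Now I need to derive a table of the individuals and all consecutive 30-day periods between their start and end dates (counting from start_date). The result should look like this: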
ID | period | from_date  | to_date
1  | 1      | 2015-02-15 | 2015-03-17
1  | 2      | 2015-03-18 | 2015-04-17
2  | 1      | 2015-03-10 | 2015-04-09
2  | 2      | 2015-04-10 | 2015-05-10
2  | 3      | 2015-05-11 | 2015-06-10
Do you know a neat way to create such a table in MySQL? If MySQL is too cumbersome for this kind of data manipulation, R or Excel would also work for me.
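You can generate a range of numbers and cross join it against all the records, adding as many 30-day groups to each row's start date as the number returned.
Something like this (not tested, so please excuse any typos):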
SELECT a.id, b.aNum, DATE_ADD(a.start_date, INTERVAL (b.aNum * 30) DAY) AS from_date, DATE_ADD(a.start_date, INTERVAL ((b.aNum + 1) * 30) DAY) AS to_date
FROM sometable a
CROSS JOIN
(
SELECT tens.aCnt * 10 + units.aCnt AS aNum
FROM
(SELECT 1 AS aCnt UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9 UNION SELECT 0) units
CROSS JOIN
(SELECT 1 AS aCnt UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9 UNION SELECT 0) tens
) b
WHERE DATE_ADD(a.start_date, INTERVAL (b.aNum * 30) DAY) <= end_date
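This version only copes with up to 100 groups of 30 days, but it can easily be extended (although the more groups it has to handle, the slower it gets).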
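To make the digit cross-join trick concrete outside SQL, here is a minimal Python sketch of what the query above computes. The function names `number_range` and `thirty_day_groups` are made up for illustration, and the sample dates are the first row from the question. It mirrors the posted (untested) query as written, not a corrected version of it.

```python
from datetime import date, timedelta
from itertools import product

DIGITS = range(10)  # plays the role of the ten-row UNION subquery


def number_range():
    """Cross join of tens and units: every integer 0..99 exactly once."""
    return sorted(tens * 10 + units for tens, units in product(DIGITS, DIGITS))


def thirty_day_groups(start, end):
    """Mirror the query: keep each n where start + 30*n days <= end."""
    rows = []
    for n in number_range():
        from_date = start + timedelta(days=30 * n)
        if from_date <= end:
            to_date = start + timedelta(days=30 * (n + 1))
            rows.append((n, from_date, to_date))
    return rows


rows = thirty_day_groups(date(2015, 2, 15), date(2015, 4, 20))
for row in rows:
    print(row)
```

For this sample row the sketch produces three groups. Consecutive groups share a boundary date, and the last to_date falls after end_date, which is the gap the asker's adapted query further down sets out to close.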
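Tricky question, I must say.
Here is my attempt in R using the data.table package. First, I would make sure the date columns in your data are of the correct date class: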
library(data.table)
indx <- grep("date", names(df))
setDT(df)[, (indx) := lapply(.SD, as.Date), .SDcols = indx]
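Then we compute the 30-day intervals per ID, adding a cumulative index to the from and to columns so that each period starts the day after the previous one ends: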
df[,
   {
     temp <- seq.Date(start_date, end_date, by = "30 days")
     indx <- seq_along(temp[-(1L:2L)])
     .(
       Period = c(indx, length(temp) - 1L),
       from = c(temp[1L], temp[-c(1L, length(temp))] + indx),
       to = c(temp[2L], temp[-c(1L:2L)] + indx)
     )
   }
   , by = ID]
# ID Period from to
# 1: 1 1 2015-02-15 2015-03-17
# 2: 1 2 2015-03-18 2015-04-17
# 3: 2 1 2015-03-10 2015-04-09
# 4: 2 2 2015-04-10 2015-05-10
# 5: 2 3 2015-05-11 2015-06-10
I slightly adapted Kickstart's code to cover all the requirements of my original post; maybe it helps someone with a similar problem:
SELECT a.pid, b.aNum+1 as period, DATE_ADD(a.start_date, INTERVAL (b.aNum * 31) DAY) AS from_date,
DATE_ADD(DATE_ADD(a.start_date, INTERVAL (b.aNum * 31) DAY), INTERVAL 30 DAY) AS to_date
FROM any_table a
CROSS JOIN
(
SELECT hundreds.aCnt*100 + tens.aCnt * 10 + units.aCnt AS aNum
FROM
(SELECT 1 AS aCnt UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9 UNION SELECT 0) units
CROSS JOIN
(SELECT 1 AS aCnt UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9 UNION SELECT 0) tens
CROSS JOIN
(SELECT 1 AS aCnt UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9 UNION SELECT 0) hundreds
) b
WHERE DATE_ADD(a.start_date, INTERVAL (b.aNum * 31 + 30) DAY) <= a.end_date
Now each period starts one day after the previous one ends, and an individual's last 30-day period ends no later than end_date.
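As a cross-check of the arithmetic, the rule described here (each period spans 30 days after its first day, the next period starts one day later, and only periods ending on or before end_date are kept) can be sketched in Python. The names `periods`, `people`, and `table` are hypothetical, and the input is just the two sample rows from the question.

```python
from datetime import date, timedelta


def periods(start, end, span=30):
    """Consecutive periods: from_date + span days = to_date, and the next
    period starts the day after, so the stride is span + 1 days."""
    result = []
    n = 0
    while True:
        from_date = start + timedelta(days=n * (span + 1))
        to_date = from_date + timedelta(days=span)
        if to_date > end:  # keep only periods ending on or before end_date
            break
        result.append((n + 1, from_date, to_date))
        n += 1
    return result


# Sample rows from the question: ID -> (start_date, end_date)
people = {
    1: (date(2015, 2, 15), date(2015, 4, 20)),
    2: (date(2015, 3, 10), date(2015, 6, 15)),
}

table = [(pid, *row) for pid, (s, e) in people.items() for row in periods(s, e)]
for line in table:
    print(line)
```

On the sample data this reproduces the five rows of the expected result table from the question, which suggests the 31-day stride with a 30-day span is the intended rule.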