Slice a df into windows of 3Y and 1M with a date range in Python
I have a df with a date index as follows:
import pandas as pd

ind = pd.date_range(start="2015-12-31", end="2022-04-26", freq="D")
df = pd.DataFrame(
    {
        "col1": range(len(ind))
    },
    index=ind
)
What I need is to slice the df into windows that start at each month end from 2017-08-31 onward and span 3 years plus 1 month, so I have the following code:
from datetime import timedelta
from dateutil.relativedelta import relativedelta

n = timedelta(365 * 3) + relativedelta(months=1)
fechas_ = pd.date_range("2017-08-31", ind.max() - n, freq="M")
# create a for loop to check the beginning and the end of each window
for i in fechas_:
    print(f"start: {i}")
    print(f"end: {i + n}")
    print("\n")
My problem is that I need each window to end on the last day of the month, for example:
# first window
start: 2017-08-31 00:00:00
end: 2020-09-30 00:00:00
# second window
start: 2017-09-30 00:00:00
end: 2020-10-31 00:00:00
# so on
But I get:
# first window
start: 2017-08-31 00:00:00
end: 2020-09-29 00:00:00
# second window
start: 2017-09-30 00:00:00
end: 2020-10-29 00:00:00
# 3
2017-10-31 00:00:00
2020-11-29 00:00:00
# 4
2017-11-30 00:00:00
2020-12-29 00:00:00
# 5
2017-12-31 00:00:00
2021-01-30 00:00:00
# 6
2018-01-31 00:00:00
2021-02-27 00:00:00
# 7
2018-02-28 00:00:00
2021-03-27 00:00:00
# 8
2018-03-31 00:00:00
2021-04-29 00:00:00
# 9
2018-04-30 00:00:00
2021-05-29 00:00:00
# 10
2018-05-31 00:00:00
2021-06-29 00:00:00
# 11
2018-06-30 00:00:00
2021-07-29 00:00:00
# 12
2018-07-31 00:00:00
2021-08-30 00:00:00
# 13
2018-08-31 00:00:00
2021-09-29 00:00:00
# 14
2018-09-30 00:00:00
2021-10-29 00:00:00
# 15
2018-10-31 00:00:00
2021-11-29 00:00:00
# 16
2018-11-30 00:00:00
2021-12-29 00:00:00
# 17
2018-12-31 00:00:00
2022-01-30 00:00:00
# 18
2019-01-31 00:00:00
2022-02-27 00:00:00
# 19
2019-02-28 00:00:00
2022-03-27 00:00:00
Does anyone know how I can solve this?
Thank you very much
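In your code, the line
n = timedelta(365 * 3) + relativedelta(months=1)
adds a fixed 1095 days, which ignores leap days and so drifts away from the month end. Try replacing it with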
n = relativedelta(years=3, months=1, day=31)
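To see the suggestion end to end, here is a minimal sketch, assuming pandas and python-dateutil are installed. In relativedelta, years and months are relative shifts, while the singular day=31 is an absolute field that gets clipped to the last valid day of the resulting month. The names starts and windows, the use of pd.offsets.MonthEnd() in place of the "M" alias, and the filter that keeps only windows ending inside the index are my own assumptions, not part of the original answer.

```python
import pandas as pd
from dateutil.relativedelta import relativedelta

ind = pd.date_range(start="2015-12-31", end="2022-04-26", freq="D")
df = pd.DataFrame({"col1": range(len(ind))}, index=ind)

# Shift by 3 years and 1 month, then pin the day to 31;
# dateutil clips that to the last day of the target month.
n = relativedelta(years=3, months=1, day=31)

# Month-end start dates. An offset object is used instead of a string alias
# so the call does not depend on which alias the installed pandas accepts.
starts = pd.date_range("2017-08-31", ind.max(), freq=pd.offsets.MonthEnd())

# Assumption: only windows that end inside the index are wanted.
windows = [(s, s + n) for s in starts if s + n <= ind.max()]

for start, end in windows[:2]:
    chunk = df.loc[start:end]  # label slicing includes both endpoints
    print(f"start: {start}  end: {end}  rows: {len(chunk)}")
```

With this data the first two windows end on 2020-09-30 and 2020-10-31, as asked in the question. Filtering on the computed end date avoids relying on ind.max() - n, which with day=31 would land on a month end and could admit a window that runs past the last available date.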