如何从日期时间索引中每月切出第 n 天?
How to slice day n each month from datetime index?
我有一个如下所示的索引:
DatetimeIndex(['2005-01-03', '2005-01-04', '2005-01-05', '2005-01-07',
'2005-01-10', '2005-01-11', '2005-01-12', '2005-01-13',
'2005-01-14', '2005-01-17',
...
'2021-11-18', '2021-11-19', '2021-11-22', '2021-11-23',
'2021-11-24', '2021-11-25', '2021-11-26', '2021-11-29',
'2021-11-30', '2021-12-01'],
在索引中为每个月分割第 n 天的最佳方法是什么?如果第 n 天在一个月内不存在,则应包括索引中的下一天。
IIUC,计算每一天与第N天的差值后使用resample
:
设置一个MRE:
dti = pd.date_range('2021', '2022', freq='6D')
df = pd.DataFrame({'A': np.random.random(len(dti))}, index=dti)
print(df)
# Output:
A
2021-01-01 0.771107
2021-01-07 0.513512
2021-01-13 0.613989
2021-01-19 0.842237
2021-01-25 0.056986
... ...
2021-12-03 0.002022
2021-12-09 0.943162
2021-12-15 0.665152
2021-12-21 0.350673
2021-12-27 0.109381
[61 rows x 1 columns]
Select 右行
# N = 12
>>> df.loc[df.assign(diff=abs(N - df.index.day)).resample('M')['diff'].idxmin().values]
A
2021-01-13 0.613989 # nearest day
2021-02-12 0.847639 # day N
2021-03-14 0.356432 # nearest day
2021-04-13 0.279536 # nearest day
2021-05-13 0.628562 # nearest day
2021-06-12 0.859800 # day N
2021-07-12 0.112848 # day N
2021-08-11 0.933266 # nearest day
2021-09-10 0.484271 # nearest day
2021-10-10 0.289343 # nearest day
2021-11-09 0.963057 # nearest day
2021-12-09 0.943162 # nearest day
我有一个如下所示的索引:
DatetimeIndex(['2005-01-03', '2005-01-04', '2005-01-05', '2005-01-07',
'2005-01-10', '2005-01-11', '2005-01-12', '2005-01-13',
'2005-01-14', '2005-01-17',
...
'2021-11-18', '2021-11-19', '2021-11-22', '2021-11-23',
'2021-11-24', '2021-11-25', '2021-11-26', '2021-11-29',
'2021-11-30', '2021-12-01'],
在索引中为每个月分割第 n 天的最佳方法是什么?如果第 n 天在一个月内不存在,则应包括索引中的下一天。
IIUC,计算每一天与第N天的差值后使用resample
:
设置一个MRE:
dti = pd.date_range('2021', '2022', freq='6D')
df = pd.DataFrame({'A': np.random.random(len(dti))}, index=dti)
print(df)
# Output:
A
2021-01-01 0.771107
2021-01-07 0.513512
2021-01-13 0.613989
2021-01-19 0.842237
2021-01-25 0.056986
... ...
2021-12-03 0.002022
2021-12-09 0.943162
2021-12-15 0.665152
2021-12-21 0.350673
2021-12-27 0.109381
[61 rows x 1 columns]
Select 右行
# N = 12
>>> df.loc[df.assign(diff=abs(N - df.index.day)).resample('M')['diff'].idxmin().values]
A
2021-01-13 0.613989 # nearest day
2021-02-12 0.847639 # day N
2021-03-14 0.356432 # nearest day
2021-04-13 0.279536 # nearest day
2021-05-13 0.628562 # nearest day
2021-06-12 0.859800 # day N
2021-07-12 0.112848 # day N
2021-08-11 0.933266 # nearest day
2021-09-10 0.484271 # nearest day
2021-10-10 0.289343 # nearest day
2021-11-09 0.963057 # nearest day
2021-12-09 0.943162 # nearest day