Pandas DatetimeIndex strange behaviour
I am working with a DataFrame whose index consists of year-month strings, for example:
index = ['2007-01', '2007-03', ...]
However, the index is incomplete; for example, 2007-02 is missing.
What I want is to reindex the DataFrame with a complete index.
What I have tried:
In [60]: pd.DatetimeIndex(start='2007-01', end='2007-12', freq='M')
Out[60]:
DatetimeIndex(['2007-01-31', '2007-02-28', '2007-03-31', '2007-04-30',
'2007-05-31', '2007-06-30', '2007-07-31', '2007-08-31',
'2007-09-30', '2007-10-31', '2007-11-30'],
dtype='datetime64[ns]', freq='M')
Here the index falls on the last day of each month.
In [64]: pd.DatetimeIndex(['2007-01', '2007-03', '2007-04', '2007-05'])
Out[64]: DatetimeIndex(['2007-01-01', '2007-03-01', '2007-04-01', '2007-05-01'], dtype='datetime64[ns]', freq=None)
Here the index falls on the first day of each month.
How can I deal with this?
If you need a month-start frequency, I think you need to pass the parameter freq='MS':
print (pd.DatetimeIndex(start='2007-01', end='2007-12', freq='MS'))
DatetimeIndex(['2007-01-01', '2007-02-01', '2007-03-01', '2007-04-01',
'2007-05-01', '2007-06-01', '2007-07-01', '2007-08-01',
'2007-09-01', '2007-10-01', '2007-11-01', '2007-12-01'],
dtype='datetime64[ns]', freq='MS')
See Offset Aliases in the pandas documentation.
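To tie this back to the goal of reindexing, here is a minimal sketch of one way it could be done. The sample data and the column name `value` are made up for illustration. It uses pd.date_range rather than the DatetimeIndex constructor, because recent pandas releases no longer accept start/end in that constructor.

```python
import pandas as pd

# Hypothetical data: year-month string index with 2007-02 missing.
df = pd.DataFrame({"value": [1, 2, 3]}, index=["2007-01", "2007-03", "2007-04"])

# Parse the strings; each one becomes the first day of its month.
df.index = pd.to_datetime(df.index, format="%Y-%m")

# Build the complete month-start range, then reindex.
# Months absent from the original index get NaN in every column.
full_index = pd.date_range(start="2007-01", end="2007-04", freq="MS")
df_full = df.reindex(full_index)
print(df_full)
```

Because both the parsed index and the generated range sit on the first day of each month, the existing rows line up and only the missing month is filled with NaN.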
Another solution is to use PeriodIndex to generate the months:
print (pd.PeriodIndex(start='2007-01', end='2007-12', freq='M'))
PeriodIndex(['2007-01', '2007-02', '2007-03', '2007-04', '2007-05', '2007-06',
'2007-07', '2007-08', '2007-09', '2007-10', '2007-11', '2007-12'],
dtype='int64', freq='M')
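The same reindexing can be sketched with periods instead of timestamps, which avoids the month-start versus month-end question altogether. Again the data and the column name `value` are hypothetical, and pd.period_range is used here as the range constructor that current pandas versions provide.

```python
import pandas as pd

# Hypothetical data: year-month string index with 2007-02 missing.
df = pd.DataFrame({"value": [1, 2, 3]}, index=["2007-01", "2007-03", "2007-04"])

# Convert the strings to monthly periods (no day component involved).
df.index = pd.PeriodIndex(df.index, freq="M")

# Build the complete monthly range, then reindex.
# Months absent from the original index get NaN in every column.
full_index = pd.period_range(start="2007-01", end="2007-04", freq="M")
df_full = df.reindex(full_index)
print(df_full)
```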