How to get the shifted index value of a DataFrame in Pandas?

Consider the following simple example:

import pandas as pd

date = pd.date_range('1/1/2011', periods=5, freq='H')

df = pd.DataFrame({'cat' : ['A', 'A', 'A', 'B',
                         'B']}, index = date)
df
Out[278]: 
                    cat
2011-01-01 00:00:00   A
2011-01-01 01:00:00   A
2011-01-01 02:00:00   A
2011-01-01 03:00:00   B
2011-01-01 04:00:00   B

I want to create a variable that contains the lagged/lead value of the index. Something like this:

df['index_shifted']=df.index.shift(1)

So, for example, at time 2011-01-01 01:00:00 I want the variable index_shifted to be 2011-01-01 00:00:00.

How can I do that? Thanks!

What's wrong with df['index_shifted']=df.index.shift(-1)?

(Genuine question, I'm not sure whether I'm missing something.)

I think you need Index.shift with -1:

df['index_shifted']= df.index.shift(-1)
print (df)
                    cat       index_shifted
2011-01-01 00:00:00   A 2010-12-31 23:00:00
2011-01-01 01:00:00   A 2011-01-01 00:00:00
2011-01-01 02:00:00   A 2011-01-01 01:00:00
2011-01-01 03:00:00   B 2011-01-01 02:00:00
2011-01-01 04:00:00   B 2011-01-01 03:00:00
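
A note on the sign convention, as a minimal sketch: Index.shift(n) moves every timestamp by n multiples of the index frequency, so a negative n gives the lagged (earlier) timestamp and a positive n gives the lead (later) one. The snippet rebuilds the frame from the question. The column names index_lag and index_lead, and the use of pd.offsets.Hour() in place of the 'H' alias, are my own choices for illustration.

```python
import pandas as pd

# Rebuild the hourly index from the question.
# pd.offsets.Hour() stands in for freq='H' (an assumption made for illustration).
date = pd.date_range('1/1/2011', periods=5, freq=pd.offsets.Hour())
df = pd.DataFrame({'cat': ['A', 'A', 'A', 'B', 'B']}, index=date)

# Negative periods move each timestamp back by one hour (lag).
df['index_lag'] = df.index.shift(-1)

# Positive periods move each timestamp forward by one hour (lead).
df['index_lead'] = df.index.shift(1)

print(df)
```

This is why shift(1) in the question produced the next hour rather than the previous one.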

For me it works without freq, but with real data it may be necessary:

df['index_shifted']= df.index.shift(-1, freq='H')
print (df)
                    cat       index_shifted
2011-01-01 00:00:00   A 2010-12-31 23:00:00
2011-01-01 01:00:00   A 2011-01-01 00:00:00
2011-01-01 02:00:00   A 2011-01-01 01:00:00
2011-01-01 03:00:00   B 2011-01-01 02:00:00
2011-01-01 04:00:00   B 2011-01-01 03:00:00
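
As a rough cross-check, shifting by an explicit one-hour freq should give the same timestamps as plain datetime arithmetic on the index. The sketch below assumes the same frame as in the question; the variable names shifted and subtracted are mine, and pd.offsets.Hour() is used instead of the 'H' string because recent pandas releases deprecate that alias in favor of 'h'.

```python
import pandas as pd

hour = pd.offsets.Hour()
date = pd.date_range('1/1/2011', periods=5, freq=hour)
df = pd.DataFrame({'cat': ['A', 'A', 'A', 'B', 'B']}, index=date)

# Shift by an explicit one-hour offset, independent of the index's own freq.
shifted = df.index.shift(-1, freq=hour)

# Subtracting a fixed Timedelta from the index moves every timestamp the same way.
subtracted = df.index - pd.Timedelta(hours=1)

df['index_shifted'] = shifted

# Element-wise comparison of the two results.
print((shifted == subtracted).all())
```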

EDIT:

If the freq of the DatetimeIndex is None, you need to pass freq to shift:

import pandas as pd

date = pd.date_range('1/1/2011', periods=5, freq='H').union(pd.date_range('5/1/2011', periods=5, freq='H'))


df = pd.DataFrame({'cat' : ['A', 'A', 'A', 'B',
                         'B','A', 'A', 'A', 'B',
                         'B']}, index = date)

print (df.index)
DatetimeIndex(['2011-01-01 00:00:00', '2011-01-01 01:00:00',
               '2011-01-01 02:00:00', '2011-01-01 03:00:00',
               '2011-01-01 04:00:00', '2011-05-01 00:00:00',
               '2011-05-01 01:00:00', '2011-05-01 02:00:00',
               '2011-05-01 03:00:00', '2011-05-01 04:00:00'],
              dtype='datetime64[ns]', freq=None)

df['index_shifted']= df.index.shift(-1, freq='H')
print (df)
                    cat       index_shifted
2011-01-01 00:00:00   A 2010-12-31 23:00:00
2011-01-01 01:00:00   A 2011-01-01 00:00:00
2011-01-01 02:00:00   A 2011-01-01 01:00:00
2011-01-01 03:00:00   B 2011-01-01 02:00:00
2011-01-01 04:00:00   B 2011-01-01 03:00:00
2011-05-01 00:00:00   A 2011-04-30 23:00:00
2011-05-01 01:00:00   A 2011-05-01 00:00:00
2011-05-01 02:00:00   A 2011-05-01 01:00:00
2011-05-01 03:00:00   B 2011-05-01 02:00:00
2011-05-01 04:00:00   B 2011-05-01 03:00:00
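
To illustrate the point about freq=None, here is a small sketch on a shortened version of the index above (three hours per block instead of five, with made-up cat values). It shows that shift without freq fails on such an index, and it also contrasts the clock-based lag from this answer with a row-based lag built from the index as a Series. The row-based variant is not part of the answer above; it is an alternative in case "lagged index" means the timestamp of the preceding row. The names lag_by_hour, lag_by_row and shift_failed are hypothetical.

```python
import pandas as pd

hour = pd.offsets.Hour()
date = pd.date_range('1/1/2011', periods=3, freq=hour).union(
    pd.date_range('5/1/2011', periods=3, freq=hour))
df = pd.DataFrame({'cat': ['A', 'A', 'B', 'A', 'A', 'B']}, index=date)

# The union has a gap, so the resulting index carries no frequency.
# Without freq there is nothing to shift by; pandas raises
# NullFrequencyError, which is a subclass of ValueError.
try:
    df.index.shift(-1)
    shift_failed = False
except ValueError:
    shift_failed = True

# Clock-based lag: always exactly one hour earlier,
# even if no row exists at that time.
df['lag_by_hour'] = df.index.shift(-1, freq=hour)

# Row-based lag: the index value of the preceding row, NaT for the first row.
df['lag_by_row'] = df.index.to_series().shift(1)

print(df)
```

The two columns differ at the gap: for the first row of the May block, lag_by_hour points at an hour in April that has no row, while lag_by_row points at the last row of the January block.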