Simple way to shift/lag data in certain columns in df by n adjacent dates

Question

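I have a df and want to shift the data in specific columns back to the previous adjacent dates (not necessarily a fixed number of days apart), based on name and datetime index. One use case is a data collection error that caused scores to be logged against the wrong dates.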

For example, take the original df:

Date	Name	Score
2020-01-01	John	9
2020-01-01	James	8
2020-03-05	John	6
2020-03-05	James	7
2020-07-20	John	5
2020-07-20	James	4

The corrected df, after shifting ['Score'] back by one adjacent date, would look like this:

Date	Name	Score
2020-01-01	John	6
2020-01-01	James	7
2020-03-05	John	5
2020-03-05	James	4
2020-07-20	John	NA
2020-07-20	James	NA

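It seems like there should be a straightforward solution. I tried making a copy of the dataframe and using += 1 on the dates, but ran into trouble because the dates are in datetime format.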

Thank you very much!

Answer 1

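Assuming the dates are sorted, use groupby + shift. Within each Name group, shift(-1) pulls in the score from that name's next date: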

df['prev_Score'] = df.groupby('Name')['Score'].shift(-1)

Note: for clarity, the result is stored in a new column here.

Output:

         Date   Name  Score  prev_Score
0  2020-01-01   John      9         6.0
1  2020-01-01  James      8         7.0
2  2020-03-05   John      6         5.0
3  2020-03-05  James      7         4.0
4  2020-07-20   John      5         NaN
5  2020-07-20  James      4         NaN
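
For a shift of n adjacent dates, or when the rows are not already in date order, the same idea can be wrapped in a small function. The following is only a sketch: `shift_back` and its parameters `cols`, `n`, `key` and `date_col` are hypothetical names chosen for illustration, and it assumes the dates live in a regular `Date` column rather than in the index.

```python
import pandas as pd

# Rebuild the example df from the question.
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2020-01-01", "2020-01-01",
        "2020-03-05", "2020-03-05",
        "2020-07-20", "2020-07-20",
    ]),
    "Name": ["John", "James", "John", "James", "John", "James"],
    "Score": [9, 8, 6, 7, 5, 4],
})


def shift_back(frame, cols, n=1, key="Name", date_col="Date"):
    """Hypothetical helper: move `cols` back by n adjacent dates per `key`."""
    # Sort by date first so that "adjacent" means the next recorded date.
    out = frame.sort_values(date_col, kind="mergesort").copy()
    # A negative shift pulls values from later rows within each group.
    out[cols] = out.groupby(key)[cols].shift(-n)
    # Restore the original row order.
    return out.sort_index()


corrected = shift_back(df, ["Score"], n=1)
print(corrected)
```

With n=1 this overwrites Score with the values from the corrected table in the question. The last n dates of each name have no later value to pull from, so they become NaN and the column is converted to float.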

Tags: python, date, shift, lag, pandas