Calculating numeric differences per group in pandas
My DataFrame has the following structure:
patient_id | timestamp | measurement
A | 2014-10-10 | 5.7
A | 2014-10-11 | 6.3
B | 2014-10-11 | 6.1
B | 2014-10-10 | 4.1
I want to calculate the delta (difference) between consecutive measurements for each patient.
The result should look like this:
patient_id | timestamp | measurement | delta
A | 2014-10-10 | 5.7 | NaN
A | 2014-10-11 | 6.3 | 0.6
B | 2014-10-11 | 6.1 | 2.0
B | 2014-10-10 | 4.1 | NaN
What is the most elegant way to do this in pandas?
Call transform on the 'measurement' column and pass the method diff; transform returns a Series whose index is aligned with the original df:
In [4]:
df['delta'] = df.groupby('patient_id')['measurement'].transform(pd.Series.diff)
df
Out[4]:
patient_id timestamp measurement delta
0 A 2014-10-10 5.7 NaN
1 A 2014-10-11 6.3 0.6
2 B 2014-10-10 4.1 NaN
3 B 2014-10-11 6.1 2.0
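For reference, here is a self-contained sketch of the same call. The frame construction is an assumption (rows are entered in the order given in the question), and the names via_transform and via_diff are made up for illustration. It also shows that calling diff() directly on the grouped column should give the same values as transform(pd.Series.diff).

```python
import pandas as pd

# Sample data in the row order shown in the question (assumed construction).
df = pd.DataFrame({
    "patient_id": ["A", "A", "B", "B"],
    "timestamp": ["2014-10-10", "2014-10-11", "2014-10-11", "2014-10-10"],
    "measurement": [5.7, 6.3, 6.1, 4.1],
})

# Per-group difference between each row and the previous row of the same patient.
via_transform = df.groupby("patient_id")["measurement"].transform(pd.Series.diff)

# Shortcut: GroupBy objects expose diff() directly.
via_diff = df.groupby("patient_id")["measurement"].diff()

df["delta"] = via_transform
print(df)
```

Note that the difference is taken in row order, not timestamp order. With the question's row order, patient B's second row comes out as roughly -2.0 rather than the 2.0 the question asks for.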
Edit
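If the deltas should follow timestamp order within each patient, sort the df before grouping. The result is still aligned to the original index, so the row order of df itself does not change: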
In [10]:
df['delta'] = df.sort_values(by=['patient_id', 'timestamp']).groupby('patient_id')['measurement'].transform(pd.Series.diff)
df
Out[10]:
patient_id timestamp measurement delta
0 A 2014-10-10 5.7 NaN
1 A 2014-10-11 6.3 0.6
2 B 2014-10-11 6.1 2.0
3 B 2014-10-10 4.1 NaN
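A minimal runnable sketch of the sorted variant on a recent pandas, where DataFrame.sort has been removed in favour of sort_values. The frame construction and the conversion with pd.to_datetime are assumptions. ISO-formatted date strings would also sort correctly, but real datetimes are safer.

```python
import pandas as pd

# Sample data in the row order shown in the question (assumed construction).
df = pd.DataFrame({
    "patient_id": ["A", "A", "B", "B"],
    "timestamp": ["2014-10-10", "2014-10-11", "2014-10-11", "2014-10-10"],
    "measurement": [5.7, 6.3, 6.1, 4.1],
})
df["timestamp"] = pd.to_datetime(df["timestamp"])

# Sort a temporary copy by patient and time, diff within each patient,
# then assign back: pandas aligns on the index, so df keeps its row order.
df["delta"] = (
    df.sort_values(["patient_id", "timestamp"])
      .groupby("patient_id")["measurement"]
      .diff()
)
print(df)
```

Because the assignment aligns on the index, the earliest measurement of each patient gets NaN wherever it sits in the frame, which matches the layout requested in the question.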