pandas 中的 groupby 函数后有没有办法得到多个平均值?

Is there a way to get multiple averages after groupby function in pandas?

我有以下 pandas 时间序列数据框:

Index   Time    Centre position X   Centre position Y   Datafile    Group   Zone    Timeframe   dV
8789    1257.318    180.0   201.0   CHR1    CHR Zone A  Before stimulation  15.625000
8790    1257.462    181.0   195.0   CHR1    CHR Zone A  Before stimulation  42.241406
8791    1257.590    184.0   188.0   CHR1    CHR Zone A  Before stimulation  59.498227
8792    1257.718    187.0   184.0   CHR1    CHR Zone B  Before stimulation  39.062500
8793    1257.862    190.0   176.0   CHR1    CHR Zone B  Before stimulation  59.333359
8794    1257.927    190.0   173.0   CHR1    CHR Zone A  Before stimulation  46.153846
8795    1258.054    192.0   171.0   CHR1    CHR Zone A  Before stimulation  22.271080
8796    1258.198    192.0   172.0   CHR1    CHR Zone C  After stimulation   6.944444
8797    1258.326    192.0   171.0   CHR1    CHR Zone C  After stimulation   7.812500
8798    1258.454    191.0   169.0   CHR1    CHR Zone A  After stimulation   17.469281
8799    1258.598    191.0   168.0   CHR1    CHR Zone A  After stimulation   6.944444
8800    1258.726    192.0   165.0   CHR1    CHR Zone A  After stimulation   24.705294

我想提取按 TimeframeZone 分组的平均速度 (dV),但是,由于数据是连续的,我想获得给定时间范围内的多个平均值,并且区。我想不出一个优雅的方法来实现这一点,因为 groupby 平均所有值并输出一个值。

预期输出为:

非常感谢您!

您需要做的第一件事是创建一个引用列。 一个非常幼稚的方式就像

df.loc[:,'Zone_shift']=df.loc[:,'Zone'].shift(1)
df.loc[:,'Timeframe_shift']=df.loc[:,'Timeframe'].shift(1)
df.loc[:,'Groupby'] = df.apply(lambda x: 0 if x['Zone']==x['Zone_shift'] and x['Timeframe']==x['Timeframe_shift'] else 1, axis=1)
df.loc[:,'Groupby'] = df.loc[:,'Groupby'].cumsum()

添加参考数据后,dataframe是这样的

    Zone    Timeframe               dV  Zone_shift  Timeframe_shift Groupby
0   ZoneA   Beforestimulation   15.625    nan        nan            1
1   ZoneA   Beforestimulation   42.241  ZoneA   Beforestimulation   1
2   ZoneA   Beforestimulation   59.498  ZoneA   Beforestimulation   1
3   ZoneB   Beforestimulation   39.062  ZoneA   Beforestimulation   2
4   ZoneB   Beforestimulation   59.333  ZoneB   Beforestimulation   2
5   ZoneA   Beforestimulation   46.153  ZoneB   Beforestimulation   3
6   ZoneA   Beforestimulation   22.271  ZoneA   Beforestimulation   3
7   ZoneC   Afterstimulation    6.9444  ZoneA   Beforestimulation   4
8   ZoneC   Afterstimulation    7.8125  ZoneC   Afterstimulation    4
9   ZoneA   Afterstimulation    17.469  ZoneC   Afterstimulation    5
10  ZoneA   Afterstimulation    6.9444  ZoneA   Afterstimulation    5
11  ZoneA   Afterstimulation    24.705  ZoneA   Afterstimulation    5

那你只需要groupby

df.groupby(['Groupby','Zone','Timeframe']).mean()

最终输出会像

Groupby Zone    Timeframe   dV
1   ZoneA   Beforestimulation   39.12154433333333
2   ZoneB   Beforestimulation   49.1979295
3   ZoneA   Beforestimulation   34.212463
4   ZoneC   Afterstimulation    7.378472
5   ZoneA   Afterstimulation    16.373006333333333