如何按多个条件对pandas df进行分组,取均值追加到df?

How to group pandas df by multiple conditions, take the mean and append to df?

我有一个看起来像这样的 df:

df = pd.DataFrame({
    'Time' : [1,2,7,10,15,16,77,98,999,1000,1121,1245,1373,1490,1555],  
    'Act_cat' : [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 4, 4, 4, 4, 4],
    'Count' : [6, 2, 4, 1, 2, 1, 8, 4, 3, 1, 4, 13, 3, 1, 2],
    'Moving': [1,0,1,0,1,0,1,0,1,0,1,0,1,0,1]})

我想按“Act_cat”中的相同值分组并按“移动”==1 对这些组取“计数”列的平均值并将其映射回df.

我已经尝试了下面的方法,但这里“计数”列的所有行都是平均的,而不仅仅是“移动”==1 的行。

group1 = (df['moving'].eq(1) & df['Act_cat'].diff().abs() > 0).cumsum()
mean_values = df.groupby(group1)["Count"].mean()
df['newcol'] = group1.map(mean_values)

请告诉我如何解决这个问题!

谢谢, 塔尼

IIUC 使用:

group1 = (df['Moving'].eq(1) & df['Act_cat'].diff().abs() > 0).cumsum()
mean_values = df[df['Moving'].eq(1)].groupby(group1)["Count"].mean()
df['newcol'] = group1.map(mean_values)

备选方案:

group1 = (df['Moving'].eq(1) & df['Act_cat'].diff().abs() > 0).cumsum()
df['newcol'] = df['Count'].where(df['Moving'].eq(1)).groupby(group1).transform('mean')

print (df)
    Time  Act_cat  Count  Moving  newcol
0      1        1      6       1     4.6
1      2        1      2       0     4.6
2      7        1      4       1     4.6
3     10        1      1       0     4.6
4     15        1      2       1     4.6
5     16        2      1       0     4.6
6     77        2      8       1     4.6
7     98        2      4       0     4.6
8    999        2      3       1     4.6
9   1000        2      1       0     4.6
10  1121        4      4       1     3.0
11  1245        4     13       0     3.0
12  1373        4      3       1     3.0
13  1490        4      1       0     3.0
14  1555        4      2       1     3.0