Calculate the sum of a column based on another column

My df looks like this:

value    type
12       x
34       z
54       x
14       y

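I want to create a new column, sum, that holds the sum of the value column, but only for the rows where type == x. The remaining rows should be empty. For example, the output should look like this: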

value    type    sum
12       x       66
34       z
54       x       66
14       y

If you only need to handle a single type (just x):

mask = df['type'].eq('x')
df.loc[mask, 'sum'] = df.loc[mask, 'value'].sum()
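
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The way the DataFrame is built is an assumption (the question only shows the printed table); the two lines doing the work are the same as above.

```python
import pandas as pd

# Rebuild the example frame from the question (assumed construction).
df = pd.DataFrame({"value": [12, 34, 54, 14],
                   "type": ["x", "z", "x", "y"]})

# Boolean mask selecting the rows whose type is 'x'.
mask = df["type"].eq("x")

# Write the total of the masked values into the masked rows only.
# The new column is created on assignment; rows outside the mask stay NaN.
df.loc[mask, "sum"] = df.loc[mask, "value"].sum()

print(df)
```

Note that because DataFrame already has a method called sum, the new column has to be read back as df['sum']; df.sum refers to the method, not the column.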

If you might need to handle several types:

types = ['x'] # add others, e.g.: types = ['x', 'y']
df.loc[df['type'].isin(types), 'sum'] = (df.groupby('type')['value']
                                           .transform('sum')
                                         )

Output:

   value type   sum
0     12    x  66.0
1     34    z   NaN
2     54    x  66.0
3     14    y   NaN
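
A possible variant of the same idea, sketched here with Series.where instead of a .loc assignment: compute the per-type total for every row, then blank out the rows whose type is not in the list. The frame construction and the choice of types = ['x', 'y'] are assumptions made for illustration.

```python
import pandas as pd

# Rebuild the example frame from the question (assumed construction).
df = pd.DataFrame({"value": [12, 34, 54, 14],
                   "type": ["x", "z", "x", "y"]})

# Types whose rows should receive their group total (illustrative choice).
types = ["x", "y"]

# Per-type total, broadcast back to every row of the frame.
group_totals = df.groupby("type")["value"].transform("sum")

# Keep the total only where the row's type is in `types`; other rows become NaN.
df["sum"] = group_totals.where(df["type"].isin(types))

print(df)
```

With these types, the x rows get 66, the y row gets 14, and the z row is left as NaN. Each type receives its own total rather than one combined total.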

Yes, it looks odd, but it still works:

types = ['2','x']  # your keys to sum

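# Sum 'value' per selected type, then left-merge the totals back onto df.
# Passing the string 'sum' avoids the FutureWarning that newer pandas
# versions raise when the builtin sum is handed to agg.
df = df.merge(df.query('type in @types').
              groupby('type', as_index=False).
              agg('sum'),
              how='left', on='type', suffixes=(None,'_sum'))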
'''
   value type  value_sum
0     12    x       66.0
1     34    z        NaN
2     54    x       66.0
3     14    y        NaN