Pandas groupby -> aggregate - function of two columns
I am using pandas aggregate as follows:
In [6]: gb = df.groupby(['col1', 'col2'])
...: counts = gb.size().to_frame(name='counts')
...: (counts
...: .join(gb.agg({'col3': 'mean'}).rename(columns={'col3': 'col3_mean'}))
...: .join(gb.agg({'col4': 'median'}).rename(columns={'col4': 'col4_median'}))
...: .join(gb.agg({'col4': 'min'}).rename(columns={'col4': 'col4_min'}))
...: .reset_index()
...: )
How can I add one more column that contains the sum of the values of col3 * col4?
First create the column new before the groupby, then aggregate it with sum. Your solution, rewritten with named aggregation, is:
counts = (df.assign(new = df['col3'] * df['col4'])
            .groupby(['col1', 'col2'], as_index=False)
            .agg(counts=('col1','size'),
                 col3_mean=('col3','mean'),
                 col4_median=('col4','median'),
                 col4_min=('col4','min'),
                 both_sum=('new','sum')))
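To check the approach on something concrete, here is a minimal sketch on a small made-up DataFrame. The sample values are hypothetical and not from the question. The sketch also takes the row count from col3 rather than from the grouping key col1; that is an assumption made for illustration, and since size counts rows regardless of the column it gives the same numbers.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "col1": ["a", "a", "b", "b", "b"],
    "col2": ["x", "x", "y", "y", "y"],
    "col3": [1.0, 2.0, 3.0, 4.0, 5.0],
    "col4": [10.0, 20.0, 30.0, 40.0, 50.0],
})

result = (
    # Build the row-wise product first, so it can be summed per group.
    df.assign(new=df["col3"] * df["col4"])
      .groupby(["col1", "col2"], as_index=False)
      .agg(
          counts=("col3", "size"),         # rows per group
          col3_mean=("col3", "mean"),
          col4_median=("col4", "median"),
          col4_min=("col4", "min"),
          both_sum=("new", "sum"),         # sum of col3 * col4 per group
      )
)

print(result)
```

For the group (a, x) the product sum is 1*10 + 2*20 = 50, and for (b, y) it is 3*30 + 4*40 + 5*50 = 500, which is what both_sum should show.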