Pandas: Use apply to sum row and column on data frame

Question

import datetime
import pandas as pd
import numpy as np

todays_date = datetime.datetime.now().date()
index = pd.date_range(todays_date-datetime.timedelta(10), periods=10, freq='D')

columns = ['A','B', 'C']
df = pd.DataFrame(index=index, columns=columns)
df = df.fillna(0) # with 0s rather than NaNs
data = np.array([np.arange(10)]*3).T
df = pd.DataFrame(data, index=index, columns=columns)

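Given df, I would like to group by each 'column' and apply a function that calculates the sum of values for each date divided by the total for that group (A, B, C).

Example: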

def total_calc(grp):
    sum_of_group = np.sum(grp)
    return sum_of_group

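I am trying to use the 'apply' function on my data frame this way, but axis=1 only works on rows and axis=0 only works on columns, and I want to get both data points for each group.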

df.groupby(["A"]).apply(total_calc)

Any ideas?

Answer 1

I'm not sure what you are asking, so I'll guess. First, I prefer not to work with integer values here, so let's convert your df to float:

df = df.astype(float)

If you want to divide each element of column A by the sum of column A, and likewise for columns B and C, you can do this:

df.div(df.sum(axis=0), axis=1)
Out[24]: 
                   A         B         C
2016-09-24  0.000000  0.000000  0.000000
2016-09-25  0.022222  0.022222  0.022222
2016-09-26  0.044444  0.044444  0.044444
2016-09-27  0.066667  0.066667  0.066667
2016-09-28  0.088889  0.088889  0.088889
2016-09-29  0.111111  0.111111  0.111111
2016-09-30  0.133333  0.133333  0.133333
2016-10-01  0.155556  0.155556  0.155556
2016-10-02  0.177778  0.177778  0.177778
2016-10-03  0.200000  0.200000  0.200000
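
Since the question mentions both axis=0 and axis=1, here is a minimal sketch of both directions. It assumes fixed dates starting at 2016-09-24 (the question uses today's date) so that the result is reproducible; the names col_share, col_share_apply and row_share are only illustrative and do not come from the question.

```python
import numpy as np
import pandas as pd

# Fixed start date: an assumption for reproducibility, the question uses today's date
index = pd.date_range("2016-09-24", periods=10, freq="D")
data = np.array([np.arange(10)] * 3).T
df = pd.DataFrame(data, index=index, columns=["A", "B", "C"]).astype(float)

# Column-wise: each value divided by the total of its column
col_share = df.div(df.sum(axis=0), axis=1)

# The same calculation through apply, which receives one column at a time when axis=0
col_share_apply = df.apply(lambda col: col / col.sum(), axis=0)

# Row-wise: each value divided by the total for its date
# The first row is all zeros, so 0/0 gives NaN in that row
row_share = df.div(df.sum(axis=1), axis=0)
```

With this data every column of col_share sums to 1, and every row of row_share except the first holds 1/3 in each column. The div form is usually preferable to apply because it stays vectorized, but apply may read more naturally if the per-column logic grows beyond a single division.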

python

aggregation

pandas