How to resample a dataframe with different functions applied to each column if we have more than 20 columns?

Question

I know this has been asked before. The answer was:

df.resample('M').agg({'col1': np.sum, 'col2': np.mean})

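But I have 27 columns: I want to sum the first 25 and take the mean of the remaining two. Do I have to write out ('col1' through 'col25': np.sum) for all 25 columns, plus ('col26': np.mean, 'col27': np.mean) for the other two?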

My dataframe contains hourly data, and I want to convert it to monthly data. I wanted to try something like the following, but it is nonsense:

for i in col_list:
    df = df.resample('M').agg({i-2: np.sum, 'col26': np.mean, 'col27': np.mean})

Is there a shortcut for this situation?

Answer 1

You can try this instead of a for loop:

sum_col = ['col1','col2','col3','col4', ...]
sum_df = df.resample('M')[sum_col].sum()

mean_col = ['col26','col27']
mean_df = df.resample('M')[mean_col].mean()

# join the two monthly frames on their shared DatetimeIndex
df = sum_df.join(mean_df)
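
Another option, if you would rather keep the single .agg call from the question, is to build the dictionary programmatically instead of typing out 27 entries. The following is only a sketch under some assumptions: the sample frame (27 columns of ones named col1..col27, hourly for January and February 2021) is made up for illustration, the columns to sum are taken to be the first 25 by position, and pd.offsets.MonthEnd() is used as the month-end rule in place of the 'M' string from the thread, since newer pandas versions deprecate that alias.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data: hourly rows, 27 columns of ones.
idx = pd.date_range("2021-01-01", periods=24 * 59, freq=pd.Timedelta(hours=1))
cols = [f"col{i}" for i in range(1, 28)]
df = pd.DataFrame(np.ones((len(idx), len(cols))), index=idx, columns=cols)

# Assumption: the first 25 columns are summed, the last two are averaged.
sum_cols = list(df.columns[:25])
mean_cols = list(df.columns[25:])

# Build the {column: function} mapping with comprehensions.
agg_map = {**{c: "sum" for c in sum_cols}, **{c: "mean" for c in mean_cols}}

# Month-end resampling; one row per calendar month.
monthly = df.resample(pd.offsets.MonthEnd()).agg(agg_map)
print(monthly[["col1", "col26", "col27"]])
```

This keeps everything in one frame with the original column order, so no join is needed afterwards. If the columns to sum are not contiguous, the two lists can be built by name instead of by position.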

resampling

dataframe