I want to compute the row-wise mean of different sets of DataFrame columns and store the results in a new DataFrame

To do this, I have a list of lists (these are my clusters), for example:

asset_clusts=[[0,1],[3,5],[2,4, 12],...]

The original DataFrame (which I call 'x' in my code) holds the return time series of S&P 500 companies.

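I want to select columns [0, 1] of the original DataFrame, compute their row-wise mean, and store it in a new DataFrame; then compute the mean of columns [3, 5] and add it to the new DataFrame, and so on.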

mu=pd.DataFrame() 
for j in range(get_number_of_elements(asset_clusts)):
    mu=x.iloc[:,asset_clusts[j]].mean(axis=1)

However, this gives me only one column, and I checked that it is the mean of the last set of columns.

In case it is unclear, get_number_of_elements is defined as:

def get_number_of_elements(clust_list):
    # Counts the clusters; equivalent to len(clust_list)
    count = 0
    for element in clust_list:
        count += 1
    return count
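
For anyone hitting the same symptom: the likely cause is that `mu = ...` inside the loop rebinds `mu` on every pass, so only the last cluster's mean survives. Below is a minimal sketch on toy data showing the overwrite and one way to keep every cluster. The column names, values, and the names `last_only` and `kept` are made up for illustration; they are not from the original data.

```python
import pandas as pd

# Toy stand-in for the returns DataFrame (hypothetical values)
x = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0],
                  "c": [5.0, 6.0], "d": [7.0, 8.0]})
asset_clusts = [[0, 1], [2, 3]]

mu = pd.DataFrame()
for j in range(len(asset_clusts)):
    # Rebinding `mu` discards the previous result on every pass
    mu = x.iloc[:, asset_clusts[j]].mean(axis=1)

# `mu` is now a Series holding only the last cluster's row means
last_only = mu

# Assigning into a new column on each pass keeps every cluster
kept = pd.DataFrame(index=x.index)
for j, cols in enumerate(asset_clusts):
    kept[j] = x.iloc[:, cols].mean(axis=1)
```

Here `kept` ends up with one column per cluster, while `last_only` is a single Series.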

I solved it. In case it helps anyone else, this is the final function:

import pandas as pd

def clustered_series(x, org_asset_clust):
    """
    x: DataFrame of returns
    org_asset_clust: list of clusters (lists of column positions)
    Returns a DataFrame with one column per cluster: its row-wise mean.
    """
    mu = []
    for cols in org_asset_clust:
        # Row-wise mean of the columns in this cluster
        mu.append(x.iloc[:, cols].mean(axis=1))
    # Concatenate once, after the loop, instead of on every iteration
    cluster_mean = pd.concat(mu, axis=1)
    return cluster_mean
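
A more compact variant of the same idea, as a sketch only: build a dict of row-wise means keyed by cluster position and pass it to `pd.concat`, which uses the dict keys as column labels. The names `cluster_means`, `returns`, and `clusters`, and the toy data, are hypothetical and chosen just for this example.

```python
import pandas as pd

def cluster_means(returns, clusters):
    # One row-wise mean Series per cluster, keyed by cluster position
    means = {i: returns.iloc[:, cols].mean(axis=1)
             for i, cols in enumerate(clusters)}
    # Dict keys become the column labels of the result
    return pd.concat(means, axis=1)

# Toy usage with made-up numbers
toy = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0],
                    "c": [5.0, 6.0], "d": [7.0, 8.0]})
result = cluster_means(toy, [[0, 1], [2, 3]])
```

Keying by position makes it easy to trace each output column back to its cluster in the input list.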