I want to compute the row-wise mean of different sets of dataframe columns and store them in a new dataframe
To do this, I have a list of lists (these are my clusters), for example:
asset_clusts=[[0,1],[3,5],[2,4, 12],...]
The original dataframe (which I call 'x' in my code) holds the return time series of S&P 500 companies.
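I want to select columns [0, 1] of the original dataframe, compute their mean (by row) and store it in a new dataframe, then compute the mean of columns [3, 5] and add it to the new dataframe, and so on...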
mu=pd.DataFrame()
for j in range(get_number_of_elements(asset_clusts)):
    mu=x.iloc[:,asset_clusts[j]].mean(axis=1)
However, it only gives me one column, and when I checked, that column is the mean of the last set of columns.
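The symptom is consistent with `mu` being rebound on every pass of the loop, so only the last cluster's result survives. Below is a minimal sketch of that effect and of one way around it. The toy frame and the two clusters are made up for illustration and are not the S&P 500 data.

```python
import pandas as pd

# Hypothetical toy data standing in for the return series `x`.
x = pd.DataFrame({
    "a": [1.0, 2.0],
    "b": [3.0, 4.0],
    "c": [5.0, 6.0],
    "d": [7.0, 8.0],
})
asset_clusts = [[0, 1], [2, 3]]  # hypothetical clusters of column positions

# Plain assignment rebinds `mu` each time, so earlier results are lost.
for cols in asset_clusts:
    mu = x.iloc[:, cols].mean(axis=1)
overwritten = mu  # a Series holding only the last cluster's row means

# Collect each row-mean Series in a list, then join them side by side.
means = [x.iloc[:, cols].mean(axis=1) for cols in asset_clusts]
collected = pd.concat(means, axis=1)  # one column per cluster
```

Here `overwritten` is a single Series, while `collected` is a dataframe with one column per cluster.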
In case of any ambiguity, the function get_number_of_elements is:
def get_number_of_elements(clist):
    count = 0
    for element in clist:
        count += 1
    return count
I solved it. In case it helps anyone else, here is the final function:
import pandas as pd

def clustered_series(x, org_asset_clust):
    """
    x: return data
    org_asset_clust: list of clusters
    ----> mean of each cluster returns by row
    """
    mu = []
    # len() gives the same count as the hand-written element counter
    for j in range(len(org_asset_clust)):
        # row-wise mean over the columns of cluster j
        mu.append(x.iloc[:, org_asset_clust[j]].mean(axis=1))
    # join the per-cluster Series side by side, one column per cluster
    cluster_mean = pd.concat(mu, axis=1)
    return cluster_mean
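For what it's worth, the same idea can probably be written more compactly by building the result from a dict of Series. This is only a sketch: the name `clustered_series_compact`, the `cluster_{i}` column labels, and the toy data are assumptions for illustration and do not come from the original code.

```python
import pandas as pd

def clustered_series_compact(x, clusters):
    """Row-wise mean of each cluster of column positions, one output column per cluster."""
    # Each dict key becomes a column label; each value is a Series of row means.
    return pd.DataFrame(
        {f"cluster_{i}": x.iloc[:, cols].mean(axis=1) for i, cols in enumerate(clusters)}
    )

# Hypothetical toy data standing in for the return series.
x = pd.DataFrame([[1.0, 3.0, 5.0, 7.0], [2.0, 4.0, 6.0, 8.0]], columns=list("abcd"))
result = clustered_series_compact(x, [[0, 1], [2, 3]])
```

Labelling the columns makes it easier to tell which cluster each column belongs to, whereas `pd.concat` on unnamed Series just numbers them 0, 1, 2, and so on.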