How to get the mean values per different group of rows with pandas?
I have an ASCII file like this:
7.00000000 5.61921453
18.00000000 9.75818253
13.00000000 37.94074631
18.00000000 29.54162407
10.00000000 18.82115364
13.00000000 15.00485802
16.00000000 19.24893761
20.00000000 22.59035683
17.00000000 59.69598007
17.00000000 34.07574844
18.00000000 24.17820358
13.00000000 24.70093536
11.00000000 23.37569046
14.00000000 34.14352036
13.00000000 33.33922577
16.00000000 36.64311981
20.00000000 60.21446609
20.00000000 33.54150391
18.00000000 40.84828949
21.00000000 40.31245041
34.00000000 91.71004486
40.00000000 93.24317169
42.00000000 43.94712067
12.00000000 32.73310471
7.00000000 25.25534248
9.00000000 23.14623833
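I want to compute (separately for each of the two columns) the mean of the first 10 rows, then of the next 11 rows, then of the next 5 rows, to get the following output: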
14.9 25.2296802
18 40.2734046
22 43.6649956
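How can I do this in Python with pandas? If the groups had a fixed size (for example, every 10 rows), I would do the following:
import numpy as np
import pandas as pd
df = pd.read_csv(i, sep='\t', header=None)  # i is the path to the ASCII file
df_mean = df.groupby(np.arange(len(df)) // 10).mean()  # one group per 10 consecutive rows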
Use numpy.repeat to build groups of arbitrary length (here a/b/c):
import numpy as np
means = df.groupby(np.repeat(['a', 'b', 'c'], [10, 11, 5])).mean()
Output:
      0          1
a  14.9  25.229680
b  18.0  40.273405
c  22.0  43.664996
If you don't care about the group names:
groups = [10, 11, 5]
means = df.groupby(np.repeat(np.arange(len(groups)), groups)).mean()
Output:
      0          1
0  14.9  25.229680
1  18.0  40.273405
2  22.0  43.664996
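For reference, here is a self-contained sketch of the same numpy.repeat approach that can be run without the original file. It rests on a few assumptions that are not in the question: the data is inlined as a string and parsed as whitespace-separated (the original file is read with sep='\t'), and mean_by_chunks is a hypothetical helper name. The length check is an addition, since the grouping key has to have exactly one label per row.

```python
import io

import numpy as np
import pandas as pd

# The 26 rows from the question, inlined so the sketch runs without a file.
RAW = """\
7.00000000 5.61921453
18.00000000 9.75818253
13.00000000 37.94074631
18.00000000 29.54162407
10.00000000 18.82115364
13.00000000 15.00485802
16.00000000 19.24893761
20.00000000 22.59035683
17.00000000 59.69598007
17.00000000 34.07574844
18.00000000 24.17820358
13.00000000 24.70093536
11.00000000 23.37569046
14.00000000 34.14352036
13.00000000 33.33922577
16.00000000 36.64311981
20.00000000 60.21446609
20.00000000 33.54150391
18.00000000 40.84828949
21.00000000 40.31245041
34.00000000 91.71004486
40.00000000 93.24317169
42.00000000 43.94712067
12.00000000 32.73310471
7.00000000 25.25534248
9.00000000 23.14623833
"""


def mean_by_chunks(df, sizes):
    """Hypothetical helper: column means of consecutive row blocks of the given sizes."""
    # The block sizes must cover every row exactly once.
    if sum(sizes) != len(df):
        raise ValueError(f"sizes sum to {sum(sizes)}, but the frame has {len(df)} rows")
    # One integer label per row: 0 repeated sizes[0] times, 1 repeated sizes[1] times, ...
    labels = np.repeat(np.arange(len(sizes)), sizes)
    return df.groupby(labels).mean()


# Assumption: whitespace-separated input; use sep='\t' for a tab-separated file.
df = pd.read_csv(io.StringIO(RAW), sep=r"\s+", header=None)
means = mean_by_chunks(df, [10, 11, 5])
print(means)
```

Calling mean_by_chunks(df, [10, 11, 5]) should give the same three rows as the output above, and a list of sizes that does not add up to len(df) raises a ValueError in this sketch instead of reaching groupby with a key of the wrong length.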