Computing the mean of each row
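I am using a Jupyter notebook to fetch this data:
df = ascii.read('http://www.astrouw.edu.pl/cgi-asas/asas_cgi_get_data?110545-5433.5,asas3')
Columns 2-6 are magnitudes. For each row, I want to drop the minimum and maximum of the five values and compute the mean of the remaining three magnitudes.
How can I do this?
And how do I rename the columns?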
One approach is to convert your table to a pandas DataFrame and then compute the centered mean:
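import pandas as pd
from astropy.io import ascii

# Read the remote table, then round-trip it through a CSV file to get a pandas DataFrame
df = ascii.read('http://www.astrouw.edu.pl/cgi-asas/asas_cgi_get_data?110545-5433.5,asas3')
ascii.write(df, 'values1.csv', format='csv', fast_writer=False)
DF = pd.read_csv('values1.csv', sep=",")

# Keep only the five magnitude columns; .copy() avoids a SettingWithCopyWarning on the assignment below
DF_small = DF[['col2', 'col3', 'col4', 'col5', 'col6']].copy()
# For each row, drop the largest and smallest value and average the remaining three
DF_small['centered mean'] = DF_small.apply(lambda x: x.drop([x.idxmax(), x.idxmin()]).mean(), axis=1)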
As you can see, the maximum and minimum are dropped from each row before averaging. This returns:
col2 col3 col4 col5 col6 centered mean
0 11.630 11.623 11.640 11.648 11.658 11.639333
1 11.552 11.541 11.592 11.627 11.629 11.590333
2 11.704 11.739 11.697 11.683 11.699 11.700000
3 11.688 11.693 11.718 11.752 11.728 11.713000
4 11.654 11.677 11.680 11.702 11.711 11.686333
.. ... ... ... ... ... ...
626 11.613 11.564 11.631 11.632 11.613 11.619000
627 11.672 11.618 11.683 11.688 11.698 11.681000
628 11.654 11.614 11.670 11.672 11.663 11.662333
629 11.536 11.524 11.559 11.571 11.569 11.554667
630 11.647 11.641 11.660 11.664 11.673 11.657000
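The same centered mean can also be computed without a row-wise apply. The following is only a sketch of an alternative, not the method used above: it sorts each row with NumPy and averages everything except the first and last entries. The small DataFrame here is built by hand from the first three rows of the output shown above, so that the example runs without network access.

```python
import numpy as np
import pandas as pd

# First three rows of the magnitude columns, copied from the output above
DF = pd.DataFrame({
    "col2": [11.630, 11.552, 11.704],
    "col3": [11.623, 11.541, 11.739],
    "col4": [11.640, 11.592, 11.697],
    "col5": [11.648, 11.627, 11.683],
    "col6": [11.658, 11.629, 11.699],
})

mag_cols = ["col2", "col3", "col4", "col5", "col6"]

# Sort each row in ascending order, so the minimum is first and the maximum is last
sorted_mags = np.sort(DF[mag_cols].to_numpy(), axis=1)

# Drop the first and last entry of every row and average the three in between
DF["centered mean"] = sorted_mags[:, 1:-1].mean(axis=1)

print(DF)
```

On these three rows this gives the same values as the apply-based version (11.639333, 11.590333, 11.700000). It assumes the magnitude columns contain no missing values, since np.sort places NaN at the end of each row.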
Update:
If you want to keep all the other columns:
DF['centered mean'] = DF[['col2','col3','col4','col5','col6']].apply(lambda x: x.drop([x.idxmax(),x.idxmin()]).mean() , axis = 1)
This gives:
col1 col2 col3 col4 col5 col6 col7 col8 col9 \
0 3116.55170 11.630 11.623 11.640 11.648 11.658 0.054 0.054 0.057
1 1885.76359 11.552 11.541 11.592 11.627 11.629 0.037 0.049 0.035
2 1888.79953 11.704 11.739 11.697 11.683 11.699 0.044 0.064 0.045
3 1899.81685 11.688 11.693 11.718 11.752 11.728 0.039 0.046 0.032
4 1915.74475 11.654 11.677 11.680 11.702 11.711 0.038 0.054 0.037
.. ... ... ... ... ... ... ... ... ...
626 3385.79064 11.613 11.564 11.631 11.632 11.613 0.041 0.051 0.040
627 3391.76317 11.672 11.618 11.683 11.688 11.698 0.035 0.048 0.032
628 3448.69080 11.654 11.614 11.670 11.672 11.663 0.038 0.053 0.036
629 3450.66562 11.536 11.524 11.559 11.571 11.569 0.047 0.068 0.041
630 3452.79152 11.647 11.641 11.660 11.664 11.673 0.044 0.066 0.040
col10 col11 col12 col13 centered mean
0 0.067 0.079 B 99788 11.639333
1 0.042 0.054 A 2479 11.590333
2 0.054 0.064 A 2974 11.700000
3 0.039 0.049 A 3272 11.713000
4 0.044 0.054 A 4974 11.686333
.. ... ... ... ... ...
626 0.045 0.052 A 126087 11.619000
627 0.037 0.045 A 126876 11.681000
628 0.042 0.049 A 134925 11.662333
629 0.045 0.049 B 135255 11.554667
630 0.048 0.054 A 135545 11.657000
[631 rows x 14 columns]
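The question also asks how to rename the columns, which the answer above does not cover. A minimal sketch using pandas' DataFrame.rename follows. The new names ("time", "mag1", "mag2") are hypothetical placeholders chosen for illustration, not the actual meaning of the ASAS columns, and the one-row DataFrame stands in for the real data.

```python
import pandas as pd

# Stand-in for the real DataFrame: one row with three of the original column names
DF = pd.DataFrame({"col1": [3116.55170], "col2": [11.630], "col3": [11.623]})

# Map old names to new ones; these new names are hypothetical placeholders
new_names = {"col1": "time", "col2": "mag1", "col3": "mag2"}

# rename returns a new DataFrame; columns not listed in the mapping keep their names
DF = DF.rename(columns=new_names)

print(DF.columns.tolist())
```

If you rename before computing the centered mean, remember to use the new names in the column selection.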