计算每行的平均值

Question

我正在使用 Jupyter notebook 获取此数据：

df = ascii.read ('http://www.astrouw.edu.pl/cgi-asas/asas_cgi_get_data?110545-5433.5,asas3')

第 2-6 列是量级。我想连续删除每 5 种可能性的最小值和最大值，并计算剩余三个量级的平均值。

我该怎么做？以及如何重命名列？

Answer 1

一种方法是将您的 table 转换为 pandas 数据框，然后计算居中平均值：

import pandas as pd
import numpy as np
import io
from io import StringIO
from astropy.io import ascii

df = ascii.read ('http://www.astrouw.edu.pl/cgi-asas/asas_cgi_get_data?110545-5433.5,asas3')
ascii.write(df,'values1.csv', format='csv', fast_writer=False) 

DF = pd.read_csv('values1.csv', sep=",")
DF_small = DF[['col2','col3','col4','col5','col6']]

DF_small['centered mean'] = DF_small.apply(lambda x: x.drop([x.idxmax(),x.idxmin()]).mean() , axis = 1)

如您所见，我们删除了最大值和最小值。这个returns

col2    col3    col4    col5    col6  centered mean
0    11.630  11.623  11.640  11.648  11.658      11.639333
1    11.552  11.541  11.592  11.627  11.629      11.590333
2    11.704  11.739  11.697  11.683  11.699      11.700000
3    11.688  11.693  11.718  11.752  11.728      11.713000
4    11.654  11.677  11.680  11.702  11.711      11.686333
..      ...     ...     ...     ...     ...            ...
626  11.613  11.564  11.631  11.632  11.613      11.619000
627  11.672  11.618  11.683  11.688  11.698      11.681000
628  11.654  11.614  11.670  11.672  11.663      11.662333
629  11.536  11.524  11.559  11.571  11.569      11.554667
630  11.647  11.641  11.660  11.664  11.673      11.657000

更新:

如果您想保留所有其他列：

DF['centered mean'] = DF[['col2','col3','col4','col5','col6']].apply(lambda x: x.drop([x.idxmax(),x.idxmin()]).mean() , axis = 1)

给出：

col1    col2    col3    col4    col5    col6   col7   col8   col9  \
0    3116.55170  11.630  11.623  11.640  11.648  11.658  0.054  0.054  0.057   
1    1885.76359  11.552  11.541  11.592  11.627  11.629  0.037  0.049  0.035   
2    1888.79953  11.704  11.739  11.697  11.683  11.699  0.044  0.064  0.045   
3    1899.81685  11.688  11.693  11.718  11.752  11.728  0.039  0.046  0.032   
4    1915.74475  11.654  11.677  11.680  11.702  11.711  0.038  0.054  0.037   
..          ...     ...     ...     ...     ...     ...    ...    ...    ...   
626  3385.79064  11.613  11.564  11.631  11.632  11.613  0.041  0.051  0.040   
627  3391.76317  11.672  11.618  11.683  11.688  11.698  0.035  0.048  0.032   
628  3448.69080  11.654  11.614  11.670  11.672  11.663  0.038  0.053  0.036   
629  3450.66562  11.536  11.524  11.559  11.571  11.569  0.047  0.068  0.041   
630  3452.79152  11.647  11.641  11.660  11.664  11.673  0.044  0.066  0.040   

     col10  col11 col12   col13  centered mean  
0    0.067  0.079     B   99788      11.639333  
1    0.042  0.054     A    2479      11.590333  
2    0.054  0.064     A    2974      11.700000  
3    0.039  0.049     A    3272      11.713000  
4    0.044  0.054     A    4974      11.686333  
..     ...    ...   ...     ...            ...  
626  0.045  0.052     A  126087      11.619000  
627  0.037  0.045     A  126876      11.681000  
628  0.042  0.049     A  134925      11.662333  
629  0.045  0.049     B  135255      11.554667  
630  0.048  0.054     A  135545      11.657000  

[631 rows x 14 columns]

计算每行的平均值

Counting means from each row

python

rename

mean

python-3.x

jupyter-notebook