Python dataframe: find the index of the top 5, then index into another column

Question

I have a DataFrame with two numeric columns, A and B. I want to find the top 5 values in column A and return the values of column B at the positions of those top 5.

Thanks a lot.

Answer 1

I think you need DataFrame.nlargest on column A to get the top 5 rows, then select column B:

import pandas as pd

df = pd.DataFrame({'A':[4,5,26,43,54,36,18,7,8,9],
                   'B':range(10)})

print (df)
    A  B
0   4  0
1   5  1
2  26  2
3  43  3
4  54  4
5  36  5
6  18  6
7   7  7
8   8  8
9   9  9

print (df.nlargest(5, 'A'))
    A  B
4  54  4
3  43  3
5  36  5
2  26  2
6  18  6

a = df.nlargest(5, 'A')['B']
print (a)
4    4
3    3
5    5
2    2
6    6
Name: B, dtype: int64

An alternative solution with sorting:

a = df.sort_values('A', ascending=False)['B'].head(5)
print (a)
4    4
3    3
5    5
2    2
6    6
Name: B, dtype: int64
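
If you want to follow the question's wording literally (first get the index labels of the top 5, then index into the other column), a minimal sketch could look like the following. It reuses the sample df from above; the names top_idx and b_values are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [4, 5, 26, 43, 54, 36, 18, 7, 8, 9],
                   'B': range(10)})

# Index labels of the 5 largest values in column A (Series.nlargest)
top_idx = df['A'].nlargest(5).index

# Use those labels to look up the matching rows of column B
b_values = df.loc[top_idx, 'B']

print(list(top_idx))      # [4, 3, 5, 2, 6]
print(b_values.tolist())  # [4, 3, 5, 2, 6]
```

This gives the same result as df.nlargest(5, 'A')['B'], but keeps the index labels around in case you need them for other columns too.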

Answer 2

The nlargest function on a DataFrame will do the job: df.nlargest(number_of_rows, 'column_to_sort')

import pandas as pd
df = pd.DataFrame({'A':[1,1,1,2,2,2,2,3,4],'B':[1,2,3,1,2,3,4,1,1]})
df.nlargest(5,'B')
Out[13]: 
    A      B
6   2      4
2   1      3
5   2      3
1   1      2
4   2      2
# if you want only a certain column in the output, then use

df.nlargest(5,'B')['A']
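
One thing to watch for is ties at the cutoff. By default nlargest uses keep='first', so when several rows share the last qualifying value only the earliest ones are returned. The sketch below uses made-up data (the scores frame is not from the question) to show the difference with keep='all'.

```python
import pandas as pd

# Hypothetical data: three rows tie on the value 8 at the cutoff
scores = pd.DataFrame({'A': [10, 9, 8, 8, 8, 1],
                       'B': list('abcdef')})

# Default keep='first': exactly 3 rows, ties resolved by order of appearance
first_only = scores.nlargest(3, 'A')['B']

# keep='all': every row tied with the smallest selected value is kept,
# so the result can have more than 3 rows
with_ties = scores.nlargest(3, 'A', keep='all')['B']

print(first_only.tolist())  # ['a', 'b', 'c']
print(len(with_ties))       # 5
```

Whether you want exactly n rows or all tied rows depends on your use case, so it may be worth checking which behaviour you expect.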
