Get the first row of each group of unique values in another column

Question

I want to extract only the first row for each unique value of one column of a DataFrame (pandas), for example from:

df
   col_A col_B
0      1     x
1      2    xx
2      3    xx
3      4     y
4      5     y

to

df1
   col_A col_B
0      1     x
1      2    xx
2      4     y

Answer 1

firsts = df.groupby('col_B', as_index=False).first()

Output:

>>> firsts
  col_B  col_A
0     x      1
1    xx      2
2     y      4

If the column order matters, reselect the columns in their original order:

firsts = df.groupby('col_B', as_index=False).first()[df.columns]

Output:

>>> firsts
   col_A col_B
0      1     x
1      2    xx
2      4     y
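
As an alternative to the groupby approach above (not part of the original answer), `DataFrame.drop_duplicates` can also keep the first row per `col_B` value while leaving the column order untouched. The sketch below rebuilds the frame from the question; the names `firsts` and `firsts_reset` are only illustrative.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "col_A": [1, 2, 3, 4, 5],
    "col_B": ["x", "xx", "xx", "y", "y"],
})

# keep="first" (the default) retains the first row seen for each col_B value.
# The original row labels (0, 1, 3) and the column order are preserved.
firsts = df.drop_duplicates(subset="col_B", keep="first")

# Optional: renumber the index from 0 to match the desired df1.
firsts_reset = firsts.reset_index(drop=True)

print(firsts_reset)
```

One difference worth knowing: `GroupBy.first()` returns the first non-null value in each column, so with missing data it may combine values from different rows, whereas `drop_duplicates` always keeps whole rows.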
