pandas: how to select first or last by column, in keeping with drop_duplicates

Question

As shown below, for each duplicated name I need to keep the first occurrence of name and the last value of team.

How can I do this with .drop_duplicates() or some other approach?

   name  team ...
0  john  a    ...
1  mike  b    ...
2  john  c

↓

   name  team ...
0  john  c    ...
1  mike  b    ...

-- Additional notes in response to the comments --

.groupby('name').agg({'team': 'last', 'country': 'first'})

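With the way this currently works, if the first row of country for a name is NaN, I get a value that is not the first one, as shown below.

Is this because NaN is ignored? Even when first is specified and the first value is NaN, the NaN must be kept.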

   name  team  country ...
0  john  a     NaN     ...
1  mike  b     Brazil  ...
2  john  c     Canada  ...

↓

   name  team  country ...
0  john  c     Canada  ...
1  mike  b     Brazil  ...

Answer 1

You can use the .groupby() function:

df.groupby('name').agg({'team': 'last'})
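As a minimal runnable sketch of that call, assuming the sample rows from the question (the country column comes from the follow-up note, and the variable names are only illustrative):

```python
import numpy as np
import pandas as pd

# Sample data modeled on the tables in the question.
df = pd.DataFrame({
    "name": ["john", "mike", "john"],
    "team": ["a", "b", "c"],
    "country": [np.nan, "Brazil", "Canada"],
})

# One row per name: last team, first country.
# The built-in 'first' and 'last' aggregations skip NaN, so john's
# country comes out as 'Canada' rather than NaN.
result = df.groupby("name", as_index=False).agg(
    {"team": "last", "country": "first"}
)
print(result)
```

This reproduces the behavior described in the follow-up: the 'first' aggregation returns the first non-null value in each group, not the value in the first row.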

Note that the value returned for each name depends on how the DataFrame is sorted.
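If a NaN in the first row has to be kept, one possible approach is to pick rows by position instead of using the 'first'/'last' aggregations. The sketch below shows two ways of doing that, again assuming the sample data from the question; first_rows, last_rows and by_position are illustrative names, not part of the original post.

```python
import numpy as np
import pandas as pd

# Sample data modeled on the tables in the question.
df = pd.DataFrame({
    "name": ["john", "mike", "john"],
    "team": ["a", "b", "c"],
    "country": [np.nan, "Brazil", "Canada"],
})

# Option 1: drop_duplicates keeps whole rows, so NaN survives.
first_rows = df.drop_duplicates("name", keep="first").set_index("name")
last_rows = df.drop_duplicates("name", keep="last").set_index("name")
# Take every column from the first row, then overwrite team with the
# value from the last row (aligned on the name index).
result = first_rows.assign(team=last_rows["team"]).reset_index()
print(result)

# Option 2: groupby with positional picks, which do not skip NaN.
by_position = (
    df.groupby("name")
    .agg({"team": lambda s: s.iloc[-1], "country": lambda s: s.iloc[0]})
    .reset_index()
)
print(by_position)
```

Both variants still depend on the row order of the DataFrame, so sort it first if "first" and "last" should follow some other column.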

python

dataframe

pandas