Selecting a particular row from a groupby object in Python

Question

id    marks  year 
1     18      2013
1     25      2012
3     16      2014
2     16      2013
1     19      2013
3     25      2013
2     18      2014

Suppose I now group the data above by id with this Python command:
grouped = file.groupby(file.id)

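I would like to get a new file that keeps, for each group, only the rows from the most recent year, that is, the highest year within that group.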

Please tell me the command. I am trying to use apply, but it only gives me a boolean expression. I want the entire row for the latest year.

Answer 1

I cobbled this together from: Python : Getting the Row which has the max value in groups using groupby

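So basically we can group by the 'id' column, then call transform on the 'year' column and build a boolean index that is True where a row's year matches the maximum year for its 'id':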

In [103]:

df[df.groupby(['id'])['year'].transform('max') == df['year']]
Out[103]:
   id  marks  year
0   1     18  2013
2   3     16  2014
4   1     19  2013
6   2     18  2014
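
For anyone who wants to reproduce this without the original file, here is a minimal self-contained sketch. It assumes the table from the question is loaded into a DataFrame named df (the question calls it file), and the names latest_year and result are only illustrative. It passes the string 'max' to transform, which selects the same group-wise maximum as the builtin max.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "id":    [1, 1, 3, 2, 1, 3, 2],
    "marks": [18, 25, 16, 16, 19, 25, 18],
    "year":  [2013, 2012, 2014, 2013, 2013, 2013, 2014],
})

# transform returns a Series aligned with df's index, holding each row's
# group-wise maximum year (unlike an aggregation, which gives one row per id).
latest_year = df.groupby("id")["year"].transform("max")

# Boolean mask: keep every row whose year equals the latest year of its id.
# Rows that tie for the latest year are all kept.
result = df[df["year"] == latest_year]
print(result)
```

Since the question asks for a new file, the filtered frame could then be saved with something like result.to_csv("latest_rows.csv", index=False), where the file name is just a placeholder.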


python

group-by

pandas