Aggregate only one of the duplicated values with pandas groupby

I have the following data; the last column is the desired output:

activity  teacher  group  students  the desired column
One       A        a      3         5
One       B        b      2         5
two       A        c      7         7
One       D        a      3         5
two       C        c      7         7

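I want to group by activity and, when an activity has several teachers, return the number of students without counting the same students twice. I tried the following, but it adds the students of the same group to the sum more than once.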

df.groupby('activity').students.transform('sum')

The output is as follows:

activity  teacher  group  students  the output column
One       A        a      3         8
One       B        b      2         8
two       A        c      7         14
One       D        a      3         8
two       C        c      7         14

Thanks in advance for any suggestions.

IIUC:

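# Keep one row per (activity, group) so each group's students are counted once,
# then total the students per activity.
x = (
    df.drop_duplicates(subset=["activity", "group"])
    .groupby("activity")["students"]
    .sum()
)
# Map each row's activity to its de-duplicated total.
df["the desired column"] = df["activity"].map(x)
print(df)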

Prints:

  activity teacher group  students  the desired column
0      One       A     a         3                   5
1      One       B     b         2                   5
2      two       A     c         7                   7
3      One       D     a         3                   5
4      two       C     c         7                   7
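
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question and compares the plain per-activity sum with the de-duplicated one. The names `naive` and `per_activity` are only illustrative. It also assumes that a given group always has the same student count within an activity, since `drop_duplicates` simply keeps the first row of each (activity, group) pair.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "activity": ["One", "One", "two", "One", "two"],
        "teacher": ["A", "B", "A", "D", "C"],
        "group": ["a", "b", "c", "a", "c"],
        "students": [3, 2, 7, 3, 7],
    }
)

# Plain per-activity sum: groups "a" and "c" each appear twice,
# so their students are counted twice.
naive = df.groupby("activity")["students"].transform("sum")

# Keep one row per (activity, group), then sum students per activity.
# Assumption: a group has the same student count on every row.
per_activity = (
    df.drop_duplicates(subset=["activity", "group"])
    .groupby("activity")["students"]
    .sum()
)

# Broadcast the per-activity totals back onto every original row.
df["the desired column"] = df["activity"].map(per_activity)

print(naive.tolist())                     # [8, 8, 14, 8, 14]
print(df["the desired column"].tolist())  # [5, 5, 7, 5, 7]
```

If the same group could carry different student counts on different rows, the rows would need to be reconciled first (for example by taking the max per group), because `drop_duplicates` would otherwise silently keep whichever row comes first.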