Add a column filled with values that correspond to the max value in another column for every value in a third column

Question

I have this DataFrame:

    state   in  out 
0   case_1   1  -5  
1   case_2   0  -1  
2   case_2  -1   8
3   case_1  -2   5
4   case_2  -2   1

For each "state" (case_1, case_2), I need to create an additional column containing the value from "in" that corresponds to the maximum value in "out":

    state   in  out  new
0   case_1   1  -5   -2
1   case_2   0  -1   -1
2   case_2  -1   8   -1
3   case_1  -2   5   -2
4   case_2  -2   1   -1
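
To make the example reproducible, the frame above can be built roughly like this (a minimal sketch; plain integer columns and the default RangeIndex are assumptions, since the question only shows the printed table):

```python
import pandas as pd

# Sample data copied from the table in the question.
# Assumption: "in" and "out" are ordinary integer columns.
df = pd.DataFrame({
    "state": ["case_1", "case_2", "case_2", "case_1", "case_2"],
    "in": [1, 0, -1, -2, -2],
    "out": [-5, -1, 8, 5, 1],
})

print(df)
```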

Answer 1

Try:

    df['new'] = df.loc[df['state'].map(df.groupby('state')['out'].idxmax()), 'in'].values
    print(df)

    # Output:
        state  in  out  new
    0  case_1   1   -5   -2
    1  case_2   0   -1   -1
    2  case_2  -1    8   -1
    3  case_1  -2    5   -2
    4  case_2  -2    1   -1
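
The one-liner in this answer packs three steps together. Unpacked, it might look like the sketch below (the intermediate names idx_of_max and row_labels are made up for illustration, and the sample frame is rebuilt from the question's table):

```python
import pandas as pd

df = pd.DataFrame({
    "state": ["case_1", "case_2", "case_2", "case_1", "case_2"],
    "in": [1, 0, -1, -2, -2],
    "out": [-5, -1, 8, 5, 1],
})

# Step 1: for each state, the index label of the row with the largest "out".
# Here: case_1 -> 3, case_2 -> 2
idx_of_max = df.groupby("state")["out"].idxmax()

# Step 2: map every row's state to that label, giving one label per row.
row_labels = df["state"].map(idx_of_max)

# Step 3: look up "in" at those labels. The result carries the looked-up
# labels as its index, so .values is used to assign by position instead.
df["new"] = df.loc[row_labels, "in"].values

print(df)
```

Note that idxmax returns the first matching row when several rows in a state tie for the largest "out", so the "in" value of that first row is the one that gets copied.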

Answer 2

Let's try transform:

    df['new'] = df.set_index('in').groupby('state')['out'].transform('idxmax').values
    df
    Out[99]: 
        state  in  out  new
    0  case_1   1   -5   -2
    1  case_2   0   -1   -1
    2  case_2  -1    8   -1
    3  case_1  -2    5   -2
    4  case_2  -2    1   -1
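
Why this works: once "in" is the index, idxmax hands back "in" values rather than row numbers, and transform repeats each group's result on every row of that group. A step-by-step sketch of the same idea (by_in and in_at_max are illustrative names, and the frame is rebuilt from the question's table):

```python
import pandas as pd

df = pd.DataFrame({
    "state": ["case_1", "case_2", "case_2", "case_1", "case_2"],
    "in": [1, 0, -1, -2, -2],
    "out": [-5, -1, 8, 5, 1],
})

# Use "in" as the index so that idxmax yields "in" values as labels.
by_in = df.set_index("in")

# For each state, find the label ("in" value) of the largest "out",
# broadcast back to one value per row.
in_at_max = by_in.groupby("state")["out"].transform("idxmax")

# in_at_max is indexed by "in", not by df's original index,
# so the values are assigned by position.
df["new"] = in_at_max.to_numpy()

print(df)
```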

python

dataframe

python-3.x

pandas

pandas-groupby