Column name of maximum of each row in a dataframe
I have a dataframe and I want a column filled with the maximum value of each row, so I used this:
df_1['Highest_Rew_patch'] = df_1.max(axis=1)
Output:
Patch_0 Patch_1 Patch_2 ... Patch_7 exp_patch Highest_Rew_patch
0 0.0 70.0 70.0 ... 0.0 3 70.0
1 0.0 74.0 74.0 ... 0.0 4 74.0
2 0.0 78.0 78.0 ... 0.0 4 78.0
3 0.0 82.0 82.0 ... 0.0 4 82.0
4 0.0 82.0 82.0 ... 0.0 5 82.0
5 0.0 86.0 86.0 ... 0.0 6 86.0
6 0.0 90.0 90.0 ... 0.0 6 90.0
7 0.0 94.0 94.0 ... 0.0 6 94.0
8 0.0 98.0 98.0 ... 0.0 6 98.0
9 0.0 98.0 98.0 ... 0.0 7 98.0
But I want a different result:
Patch_0 Patch_1 Patch_2 Patch_7 exp_patch Highest_Rew_patch
0 0.0 70.0 70.0 3 Patch_2,Patch_7...
1 0.0 74.0 74.0 4 Patch_2,Patch_7...
So what I want is not the highest value in the row, but the header of the column (or columns) holding the highest value in that row.
Try this:
df['Highest_Rew_patch']=df.filter(like='Patch').apply(lambda x: ', '.join(x[x.eq(x.max())].index), axis=1)
This calls apply with axis=1 (row by row), then joins with ', ' every index label of x whose value equals the row maximum.
Output:
Patch_0 Patch_1 ... exp_patch Highest_Rew_patch
0 0.0 70.0 ... 3 Patch_1, Patch_2, Patch_4, Patch_6
1 0.0 74.0 ... 4 Patch_1, Patch_2, Patch_6
2 0.0 78.0 ... 4 Patch_1, Patch_2, Patch_6
3 0.0 82.0 ... 4 Patch_1, Patch_2, Patch_6
4 0.0 82.0 ... 5 Patch_1, Patch_2, Patch_6
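As a self-contained sketch of the approach above: the frame below is hypothetical sample data, not the asker's real values. It assumes the reward columns are the only ones whose names contain 'Patch' with a capital P; filter(like=...) does a case-sensitive substring match, so exp_patch is left out.

```python
import pandas as pd

# Hypothetical sample data, loosely modelled on the question's frame.
df = pd.DataFrame({
    "Patch_0": [0.0, 0.0, 5.0],
    "Patch_1": [70.0, 74.0, 5.0],
    "Patch_2": [70.0, 10.0, 5.0],
    "exp_patch": [3, 4, 4],
})

# Keep only columns whose name contains "Patch" (case-sensitive),
# which excludes "exp_patch".
patch_cols = df.filter(like="Patch")

# For each row, keep the labels whose value equals the row maximum
# and join them into one comma-separated string.
df["Highest_Rew_patch"] = patch_cols.apply(
    lambda row: ", ".join(row[row.eq(row.max())].index),
    axis=1,
)

print(df["Highest_Rew_patch"].tolist())
```

Ties are kept on purpose: a row where every patch has the same value lists every patch name.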
You can select the columns whose names start with Patch, then keep only the names of the columns whose value equals the row maximum:
>>> s = df.iloc[:, df.columns.str.startswith('Patch')].apply(
...     lambda s: s.index[s.eq(s.max())].tolist(), axis=1)
>>> s
0 [Patch_1, Patch_2]
1 [Patch_1, Patch_2]
2 [Patch_1, Patch_2]
3 [Patch_1, Patch_2]
4 [Patch_1, Patch_2]
5 [Patch_1, Patch_2]
6 [Patch_1, Patch_2]
7 [Patch_1, Patch_2]
8 [Patch_1, Patch_2]
9 [Patch_1, Patch_2]
Or join them into strings:
>>> s = s.apply(lambda s: ','.join(s))
>>> s
0 Patch_1,Patch_2
1 Patch_1,Patch_2
2 Patch_1,Patch_2
3 Patch_1,Patch_2
4 Patch_1,Patch_2
5 Patch_1,Patch_2
6 Patch_1,Patch_2
7 Patch_1,Patch_2
8 Patch_1,Patch_2
9 Patch_1,Patch_2
dtype: object
Then just assign the new column:
df['Highest_Rew_patch'] = s
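Put together as a runnable script, the steps above might look like the sketch below. The data is hypothetical, and the names patches, winners and first_only are only illustrative. The last line is an addition for contrast, not part of the answer: idxmax(axis=1) returns just the first column that holds the maximum, which is why it does not give the tied names asked for here.

```python
import pandas as pd

# Hypothetical sample data with a tie between Patch_1 and Patch_2.
df = pd.DataFrame({
    "Patch_0": [0.0, 0.0],
    "Patch_1": [70.0, 74.0],
    "Patch_2": [70.0, 74.0],
    "exp_patch": [3, 4],
})

# Select columns whose name starts with "Patch" (exp_patch is excluded).
patches = df.loc[:, df.columns.str.startswith("Patch")]

# One list of column names per row: every column equal to the row maximum.
winners = patches.apply(
    lambda row: row.index[row.eq(row.max())].tolist(), axis=1
)

# Join each list into a string and assign it as the new column.
df["Highest_Rew_patch"] = winners.apply(",".join)

# For contrast: idxmax reports only the first column holding the maximum.
first_only = patches.idxmax(axis=1)

print(df)
```

Selecting patches before assigning the new column keeps Highest_Rew_patch itself out of the comparison.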