Substitute values in a dataframe based on column values in another dataframe
I have a dataframe:
section name overall admission room
0 Supriya Bachal 4432837753 4431710642 4431711344
1 Meena Kumari 4432837752 4431710642 4431711344
2 Sunita Banik 4432837752 4431710643 4431711346
3 Madhuri Bhat 4432837753 4431710643 4431711347
4 Arushi Sharda 4432837753 4431710643 4431711347
5 Vishwas Kini 4432837753 4431710643 4431711347
6 Nishit goyal 4432837752 4431710642 4431711346
7 Shibiraj Soni 4432837753 NaN 4431711347
And another dataframe:
rating overall admission room
0 1 4432837749 4431710639 4431711343
1 2 4432837750 4431710640 4431711344
2 3 4432837751 4431710641 4431711345
3 4 4432837752 4431710642 4431711346
4 5 4432837753 4431710643 4431711347
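It shows how the IDs in each column (overall, admission, and room) map to ratings (1 to 5).
Now I want to replace the IDs with their ratings.
Final dataframe: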
section name overall admission room
0 Supriya Bachal 5 4 2
1 Meena Kumari 4 4 2
2 Sunita Banik 4 5 4
3 Madhuri Bhat 5 5 5
4 Arushi Sharda 5 5 5
5 Vishwas Kini 5 5 5
6 Nishit goyal 4 4 4
7 Shibiraj Soni 5 NaN 5
We have 10 such columns, so writing an if/else for each column is not practical.
Is there an easy way to do this?
TIA
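You can map the values by setting each ID column as the index of the lookup dataframe:
# Copy the columns that stay as they are, so the assignments below do not write into a view of df
df3 = df[['section', 'name']].copy()
for col in ['overall', 'admission', 'room']:
    # df1 indexed by this column's IDs gives an ID -> rating lookup Series
    df3[col] = df[col].map(df1.set_index(col)['rating'])
Output: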
name overall admission room
0 Supriya Bachal 5 4.0 2
1 Meena Kumari 4 4.0 2
2 Sunita Banik 4 5.0 4
3 Madhuri Bhat 5 5.0 5
4 Arushi Sharda 5 5.0 5
5 Vishwas Kini 5 5.0 5
6 Nishit goyal 4 4.0 4
7 Shibiraj Soni 5 NaN 5
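For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same map approach. The data is typed in from the question; the `section` values 0 to 7 are an assumption read off the printed output. The `.astype("Int64")` step is an optional addition, not part of the answer above: it uses pandas' nullable integer dtype so that `admission` shows 4 instead of 4.0 even though one row is missing.

```python
import numpy as np
import pandas as pd

# Sample data from the question: `df` holds the IDs, `df1` is the ID -> rating lookup.
df = pd.DataFrame({
    "section": range(8),  # assumed values, inferred from the printed output
    "name": ["Supriya Bachal", "Meena Kumari", "Sunita Banik", "Madhuri Bhat",
             "Arushi Sharda", "Vishwas Kini", "Nishit goyal", "Shibiraj Soni"],
    "overall": [4432837753, 4432837752, 4432837752, 4432837753,
                4432837753, 4432837753, 4432837752, 4432837753],
    "admission": [4431710642, 4431710642, 4431710643, 4431710643,
                  4431710643, 4431710643, 4431710642, np.nan],
    "room": [4431711344, 4431711344, 4431711346, 4431711347,
             4431711347, 4431711347, 4431711346, 4431711347],
})
df1 = pd.DataFrame({
    "rating": [1, 2, 3, 4, 5],
    "overall": [4432837749, 4432837750, 4432837751, 4432837752, 4432837753],
    "admission": [4431710639, 4431710640, 4431710641, 4431710642, 4431710643],
    "room": [4431711343, 4431711344, 4431711345, 4431711346, 4431711347],
})

id_cols = ["overall", "admission", "room"]  # extend this list for the other ID columns
df3 = df[["section", "name"]].copy()
for col in id_cols:
    lookup = df1.set_index(col)["rating"]  # Series: ID -> rating for this column
    # Optional: nullable integers keep missing rows as <NA> without turning ratings into floats
    df3[col] = df[col].map(lookup).astype("Int64")

print(df3)
```

IDs that do not appear in the lookup come back as missing values rather than raising an error, so it may be worth checking `df3[id_cols].isna().sum()` afterwards.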
Edit 1
# Time taken by each solution
# Solution above (map)
%%timeit
df3 = df[['section', 'name']].copy()
for col in ['overall', 'admission', 'room']:
    df3[col] = df[col].map(df1.set_index(col)['rating'])
2.42 ms ± 70.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
# Shubham's solution
%%timeit
df.replace(df1.melt('rating').pivot(index='value', columns='variable', values='rating'))
4.82 ms ± 114 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
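`%%timeit` only works inside IPython or Jupyter. A rough sketch of the same comparison with the standard-library `timeit` module is below. Everything about the input is an assumption made for timing purposes: the lookup is rebuilt from the question's ID ranges, and the input frame is a hypothetical 8,000-row one rather than the 8 rows timed above. `with_replace` passes a plain nested dict to `replace`, standing in for the melt/pivot frame. Timings depend on machine and pandas version, so no numbers are shown.

```python
import timeit

import pandas as pd

# Lookup rebuilt from the question: rating r corresponds to base + r - 1 in each column.
bases = {"overall": 4432837749, "admission": 4431710639, "room": 4431711343}
df1 = pd.DataFrame({"rating": range(1, 6)})
for col, base in bases.items():
    df1[col] = base + df1["rating"] - 1

# Hypothetical larger input (made-up data, for timing only): the question's
# `overall` ratings repeated 1,000 times and reused for every ID column.
ratings = pd.Series([5, 4, 4, 5, 5, 5, 4, 5] * 1000)
df = pd.DataFrame({col: base + ratings - 1 for col, base in bases.items()})


def with_map():
    """Per-column Series.map against the lookup indexed by that column."""
    out = pd.DataFrame(index=df.index)
    for col in bases:
        out[col] = df[col].map(df1.set_index(col)["rating"])
    return out


def with_replace():
    """DataFrame.replace with a nested dict {column: {ID: rating}}."""
    nested = {col: df1.set_index(col)["rating"].to_dict() for col in bases}
    return df.replace(nested)


for fn in (with_map, with_replace):
    seconds = timeit.timeit(fn, number=20) / 20
    print(f"{fn.__name__}: {seconds * 1e3:.2f} ms per call")
```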
Use DataFrame.replace (here df1 is the dataframe with the IDs and df2 is the ratings lookup):
df1.replace(df2.melt('rating').pivot(index='value', columns='variable', values='rating'))
section name overall admission room
0 0 Supriya Bachal 5.0 4.0 2.0
1 1 Meena Kumari 4.0 4.0 2.0
2 2 Sunita Banik 4.0 5.0 4.0
3 3 Madhuri Bhat 5.0 5.0 5.0
4 4 Arushi Sharda 5.0 5.0 5.0
5 5 Vishwas Kini 5.0 5.0 5.0
6 6 Nishit goyal 4.0 4.0 4.0
7 7 Shibiraj Soni 5.0 NaN 5.0
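To see what the melt/pivot step produces, the sketch below builds the lookup on its own and then turns it into an explicit nested dict, `{column: {ID: rating}}`, which is a documented input form for `replace`. The names follow this answer (`df1` is the data, `df2` the lookup); `mapping` and `lookup` are illustrative names, and the `name` and `section` columns are left out for brevity.

```python
import numpy as np
import pandas as pd

# Ratings lookup from the question.
df2 = pd.DataFrame({
    "rating": [1, 2, 3, 4, 5],
    "overall": [4432837749, 4432837750, 4432837751, 4432837752, 4432837753],
    "admission": [4431710639, 4431710640, 4431710641, 4431710642, 4431710643],
    "room": [4431711343, 4431711344, 4431711345, 4431711346, 4431711347],
})

# Long format (rating, variable, value), then one row per ID and one column per ID column.
# Cells are NaN where an ID does not belong to that column.
lookup = df2.melt("rating").pivot(index="value", columns="variable", values="rating")
print(lookup)

# Explicit nested dict {column: {ID: rating}}, dropping the NaN filler cells.
mapping = {col: lookup[col].dropna().astype(int).to_dict() for col in lookup.columns}

# ID columns from the question.
df1 = pd.DataFrame({
    "overall": [4432837753, 4432837752, 4432837752, 4432837753,
                4432837753, 4432837753, 4432837752, 4432837753],
    "admission": [4431710642, 4431710642, 4431710643, 4431710643,
                  4431710643, 4431710643, 4431710642, np.nan],
    "room": [4431711344, 4431711344, 4431711346, 4431711347,
             4431711347, 4431711347, 4431711346, 4431711347],
})

result = df1.replace(mapping)
print(result)
```

One difference from `map` worth knowing: `replace` leaves IDs that are not in the lookup untouched, whereas `map` turns them into NaN.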