Pandas applymap method with passing column name as parameter
I want to use the applymap method with a somewhat complicated function on the dataset below.
value1 value2 value3 value4 value5 people
147 119 69 92 106 533.0
31 20 12 14 26 103.0
37 22 24 18 19 120.0
10 13 7 13 10 53.0
38 48 18 30 27 161.0
401 409 168 354 338 1670.0
109 92 55 82 69 407.0
5 9 7 11 9 41.0
44 36 21 48 28 177.0
59 40 19 38 27 183.0
8 9 1 7 10 35.0
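The people column is the sum of the value columns. I want to replace each value with its percentage of that sum.
For example, in the first row value1 is 147 and the row total is 533, so I want to replace 147 with (147/533)*100.
I think it should look something like this, but I can't get it to work.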
df.loc[:, 'value1':'value5'] = df.loc[:, 'value1':'value5'].applymap(lambda x: (x / df['people'])*100)
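The function applymap processes each value of a DataFrame elementwise, so inside the lambda x is a single number while df['people'] is still the whole column.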
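To make the elementwise point concrete, here is a minimal sketch of what the lambda from the question computes for one cell. The three totals are the first three people values from the question; the names `people`, `x`, and `result` are only illustrative.

```python
import pandas as pd

# First three row totals from the question's "people" column.
people = pd.Series([533.0, 103.0, 120.0])

# An elementwise function receives one cell at a time, e.g. value1 of row 0.
x = 147

# Dividing a scalar by a whole Series yields a Series, not one percentage,
# so every cell would be mapped to a full column of numbers.
result = x / people * 100

print(type(result).__name__, len(result))
```

Only the first entry of `result` is the percentage that was actually wanted, which is why a row-aligned operation is the better fit here.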
It is better to use the vectorized solution DataFrame.div:
df.loc[:, 'value1':'value5'] = df.loc[:, 'value1':'value5'].div(df['people'], axis=0) * 100
print (df)
value1 value2 value3 value4 value5 people
0 27.579737 22.326454 12.945591 17.260788 19.887430 533.0
1 30.097087 19.417476 11.650485 13.592233 25.242718 103.0
2 30.833333 18.333333 20.000000 15.000000 15.833333 120.0
3 18.867925 24.528302 13.207547 24.528302 18.867925 53.0
4 23.602484 29.813665 11.180124 18.633540 16.770186 161.0
5 24.011976 24.491018 10.059880 21.197605 20.239521 1670.0
6 26.781327 22.604423 13.513514 20.147420 16.953317 407.0
7 12.195122 21.951220 17.073171 26.829268 21.951220 41.0
8 24.858757 20.338983 11.864407 27.118644 15.819209 177.0
9 32.240437 21.857923 10.382514 20.765027 14.754098 183.0
10 22.857143 25.714286 2.857143 20.000000 28.571429 35.0
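As a self-contained check of the div approach, the sketch below rebuilds the first three rows of the question's data and verifies that each row of percentages adds up to 100. The names `value_cols`, `pct`, and `row_totals` are assumptions made for this example, and the result is kept in a separate variable rather than written back into `df`.

```python
import pandas as pd

# First three rows of the dataset from the question.
df = pd.DataFrame({
    "value1": [147, 31, 37],
    "value2": [119, 20, 22],
    "value3": [69, 12, 24],
    "value4": [92, 14, 18],
    "value5": [106, 26, 19],
    "people": [533.0, 103.0, 120.0],
})

value_cols = ["value1", "value2", "value3", "value4", "value5"]

# axis=0 aligns the divisor with the row index, so every row is divided
# by its own "people" total instead of matching on column labels.
pct = df[value_cols].div(df["people"], axis=0) * 100

# Each row of percentages should sum to (approximately) 100.
row_totals = pct.sum(axis=1)
print(pct.round(6))
print(row_totals)
```

Without `axis=0`, div would try to match the index of the people Series against the column labels, which is not what is wanted here.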
Another solution uses numpy broadcasting:
df.loc[:, 'value1':'value5'] = (df.loc[:, 'value1':'value5'].values /
df['people'].values[:, None] * 100)
print (df)
value1 value2 value3 value4 value5 people
0 27.579737 22.326454 12.945591 17.260788 19.887430 533.0
1 30.097087 19.417476 11.650485 13.592233 25.242718 103.0
2 30.833333 18.333333 20.000000 15.000000 15.833333 120.0
3 18.867925 24.528302 13.207547 24.528302 18.867925 53.0
4 23.602484 29.813665 11.180124 18.633540 16.770186 161.0
5 24.011976 24.491018 10.059880 21.197605 20.239521 1670.0
6 26.781327 22.604423 13.513514 20.147420 16.953317 407.0
7 12.195122 21.951220 17.073171 26.829268 21.951220 41.0
8 24.858757 20.338983 11.864407 27.118644 15.819209 177.0
9 32.240437 21.857923 10.382514 20.765027 14.754098 183.0
10 22.857143 25.714286 2.857143 20.000000 28.571429 35.0
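The role of `[:, None]` in the broadcasting solution is easier to see on plain arrays. This is a small sketch using the first two rows of the question's data; the array names are illustrative only.

```python
import numpy as np

# First two rows of the value columns and their totals.
values = np.array([[147, 119, 69, 92, 106],
                   [31, 20, 12, 14, 26]], dtype=float)
people = np.array([533.0, 103.0])

# Indexing with [:, None] adds a trailing axis: shape (2,) becomes (2, 1).
column = people[:, None]

# A (2, 5) array divided by a (2, 1) array broadcasts across the columns,
# so each row is divided by its own total.
pct = values / column * 100

print(column.shape, pct.shape)
```

Dividing by the one-dimensional `people` directly would fail here, because shapes (2, 5) and (2,) cannot be broadcast together.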
If you want something similar to applymap, you can use apply, but the solutions above are faster:
df.loc[:, 'value1':'value5'] = (df.loc[:, 'value1':'value5']
                                  .apply(lambda x: (x / df['people'])*100))
print (df)
value1 value2 value3 value4 value5 people
0 27.579737 22.326454 12.945591 17.260788 19.887430 533.0
1 30.097087 19.417476 11.650485 13.592233 25.242718 103.0
2 30.833333 18.333333 20.000000 15.000000 15.833333 120.0
3 18.867925 24.528302 13.207547 24.528302 18.867925 53.0
4 23.602484 29.813665 11.180124 18.633540 16.770186 161.0
5 24.011976 24.491018 10.059880 21.197605 20.239521 1670.0
6 26.781327 22.604423 13.513514 20.147420 16.953317 407.0
7 12.195122 21.951220 17.073171 26.829268 21.951220 41.0
8 24.858757 20.338983 11.864407 27.118644 15.819209 177.0
9 32.240437 21.857923 10.382514 20.765027 14.754098 183.0
10 22.857143 25.714286 2.857143 20.000000 28.571429 35.0
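The apply version works because apply hands the function a whole column at a time, and dividing one column by df['people'] aligns on the row index. The sketch below records which columns are passed in; it uses a reduced two-column toy frame whose people totals were recomputed for this example, and `to_percent` and `seen` are hypothetical names.

```python
import pandas as pd

# Toy frame: two value columns, with "people" recomputed as their row sum.
df = pd.DataFrame({
    "value1": [147, 31],
    "value2": [119, 20],
    "people": [266.0, 51.0],
})

seen = []  # names of the columns handed to the function


def to_percent(col):
    # col is a full column (a Series), not a single cell.
    seen.append(col.name)
    return col / df["people"] * 100


pct = df[["value1", "value2"]].apply(to_percent)
print(seen)
print(pct)
```

Since the function runs once per column in Python, this is typically slower than a single div call on wide frames, in line with the remark above.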