Pandas applymap method with passing column name as parameter

I want to use the applymap method with a somewhat complicated function on the dataset below.

value1 value2 value3 value4 value5  people
   147    119     69     92    106   533.0
    31     20     12     14     26   103.0
    37     22     24     18     19   120.0
    10     13      7     13     10    53.0
    38     48     18     30     27   161.0
   401    409    168    354    338  1670.0
   109     92     55     82     69   407.0
     5      9      7     11      9    41.0
    44     36     21     48     28   177.0
    59     40     19     38     27   183.0
     8      9      1      7     10    35.0

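The people column is the sum of the value columns in each row. I want to replace each value with its percentage of that row's total. For example, in the first row value1 is 147 and the row total is 533, so I want to replace 147 with (147/533)*100.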

I think it should look something like this, but I can't get it to work.

df.loc[:, 'value1':'value5'] = df.loc[:, 'value1':'value5'].applymap(lambda x: (x / df['people'])*100)

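The applymap function processes a DataFrame element-wise, so the lambda receives one scalar value at a time. Dividing that scalar by the whole df['people'] column produces a Series for every cell rather than a single number, which is why the attempt fails.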
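To make that concrete, here is a minimal sketch on a hypothetical two-row subset of the question's data (the variable names cell and per_cell_result are mine, for illustration only). It reproduces by hand what the lambda computes for a single cell:

```python
import pandas as pd

# Hypothetical two-row subset of the question's data (illustration only).
df = pd.DataFrame({
    "value1": [147, 31],
    "value2": [119, 20],
    "people": [533.0, 103.0],
})

# An element-wise function is called once per cell with a scalar such as 147.
cell = df.loc[0, "value1"]

# Dividing that scalar by the whole 'people' column gives one result per row,
# i.e. a Series, not the single percentage intended for this cell.
per_cell_result = cell / df["people"] * 100

print(type(per_cell_result).__name__)  # Series
print(len(per_cell_result))            # 2
```

Each cell would need only the people value from its own row, which an element-wise function has no way to look up.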

It is better to use a vectorized solution with DataFrame.div:

df.loc[:, 'value1':'value5'] = df.loc[:, 'value1':'value5'].div(df['people'], axis=0) * 100
print (df)
       value1     value2     value3     value4     value5  people
0   27.579737  22.326454  12.945591  17.260788  19.887430   533.0
1   30.097087  19.417476  11.650485  13.592233  25.242718   103.0
2   30.833333  18.333333  20.000000  15.000000  15.833333   120.0
3   18.867925  24.528302  13.207547  24.528302  18.867925    53.0
4   23.602484  29.813665  11.180124  18.633540  16.770186   161.0
5   24.011976  24.491018  10.059880  21.197605  20.239521  1670.0
6   26.781327  22.604423  13.513514  20.147420  16.953317   407.0
7   12.195122  21.951220  17.073171  26.829268  21.951220    41.0
8   24.858757  20.338983  11.864407  27.118644  15.819209   177.0
9   32.240437  21.857923  10.382514  20.765027  14.754098   183.0
10  22.857143  25.714286   2.857143  20.000000  28.571429    35.0
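As a quick sanity check on the DataFrame.div approach, the sketch below rebuilds the first three rows of the question's data and confirms that each row's percentages add up to 100. The cols list and the whole-column assignment df[cols] = ... are my own choices for a self-contained example; depending on the pandas version, writing floats into integer columns through .loc may emit a dtype warning, which whole-column assignment avoids.

```python
import numpy as np
import pandas as pd

# First three rows of the question's data.
df = pd.DataFrame({
    "value1": [147, 31, 37],
    "value2": [119, 20, 22],
    "value3": [69, 12, 24],
    "value4": [92, 14, 18],
    "value5": [106, 26, 19],
    "people": [533.0, 103.0, 120.0],
})
cols = ["value1", "value2", "value3", "value4", "value5"]

# axis=0 aligns the 'people' Series with the row index, so every row
# is divided by its own total instead of matching on column labels.
df[cols] = df[cols].div(df["people"], axis=0) * 100

# Because 'people' is the row sum, the percentages in each row total 100.
row_totals = df[cols].sum(axis=1)
print(np.allclose(row_totals, 100))  # True
```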

Another solution, using numpy broadcasting:

df.loc[:, 'value1':'value5'] = (df.loc[:, 'value1':'value5'].values / 
                                     df['people'].values[:, None] * 100)
print (df)
       value1     value2     value3     value4     value5  people
0   27.579737  22.326454  12.945591  17.260788  19.887430   533.0
1   30.097087  19.417476  11.650485  13.592233  25.242718   103.0
2   30.833333  18.333333  20.000000  15.000000  15.833333   120.0
3   18.867925  24.528302  13.207547  24.528302  18.867925    53.0
4   23.602484  29.813665  11.180124  18.633540  16.770186   161.0
5   24.011976  24.491018  10.059880  21.197605  20.239521  1670.0
6   26.781327  22.604423  13.513514  20.147420  16.953317   407.0
7   12.195122  21.951220  17.073171  26.829268  21.951220    41.0
8   24.858757  20.338983  11.864407  27.118644  15.819209   177.0
9   32.240437  21.857923  10.382514  20.765027  14.754098   183.0
10  22.857143  25.714286   2.857143  20.000000  28.571429    35.0
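The [:, None] indexing is what makes the broadcasting work. A small sketch with plain arrays (two rows taken from the question; the names values, people, column and percent are illustrative) shows the shapes involved:

```python
import numpy as np

values = np.array([[147, 119, 69, 92, 106],
                   [31, 20, 12, 14, 26]])   # shape (2, 5)
people = np.array([533.0, 103.0])           # shape (2,)

# Adding a trailing axis turns the 1-D totals into a column vector.
column = people[:, None]                    # shape (2, 1)

# (2, 5) / (2, 1): the single column is stretched across all five
# value columns, so each row is divided by its own total.
percent = values / column * 100

print(column.shape)   # (2, 1)
print(percent.shape)  # (2, 5)
```

Without the extra axis, the shapes (2, 5) and (2,) do not line up and numpy raises a ValueError.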

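If you want something similar to applymap, you can use apply, which passes each column to the function as a Series, but the solutions above are faster: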

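# apply passes each column (a Series) to the lambda, so the division
# by df['people'] aligns on the row index
df.loc[:, 'value1':'value5'] = (df.loc[:, 'value1':'value5']
                                   .apply(lambda x: (x / df['people'])*100))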
print (df)
       value1     value2     value3     value4     value5  people
0   27.579737  22.326454  12.945591  17.260788  19.887430   533.0
1   30.097087  19.417476  11.650485  13.592233  25.242718   103.0
2   30.833333  18.333333  20.000000  15.000000  15.833333   120.0
3   18.867925  24.528302  13.207547  24.528302  18.867925    53.0
4   23.602484  29.813665  11.180124  18.633540  16.770186   161.0
5   24.011976  24.491018  10.059880  21.197605  20.239521  1670.0
6   26.781327  22.604423  13.513514  20.147420  16.953317   407.0
7   12.195122  21.951220  17.073171  26.829268  21.951220    41.0
8   24.858757  20.338983  11.864407  27.118644  15.819209   177.0
9   32.240437  21.857923  10.382514  20.765027  14.754098   183.0
10  22.857143  25.714286   2.857143  20.000000  28.571429    35.0
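The difference from applymap is that apply hands the function a whole column at a time. The sketch below, again on the first three rows of the question's data, records which columns the function receives and checks that the result matches the DataFrame.div version. The helper to_percent and the seen list are hypothetical names introduced for this illustration.

```python
import pandas as pd

# First three rows of the question's data.
df = pd.DataFrame({
    "value1": [147, 31, 37],
    "value2": [119, 20, 22],
    "value3": [69, 12, 24],
    "value4": [92, 14, 18],
    "value5": [106, 26, 19],
    "people": [533.0, 103.0, 120.0],
})
cols = ["value1", "value2", "value3", "value4", "value5"]

seen = []  # names of the columns handed to the function

def to_percent(col):
    # col is a Series holding one full column; dividing by df['people']
    # aligns on the row index, giving one percentage per row.
    seen.append(col.name)
    return col / df["people"] * 100

via_apply = df[cols].apply(to_percent)
via_div = df[cols].div(df["people"], axis=0) * 100

# Raises an AssertionError if the two results differ.
pd.testing.assert_frame_equal(via_apply, via_div)
print(sorted(set(seen)))
```

The function runs once per column rather than once per cell, so it still does vectorized work inside each call, though with more overhead than a single div call.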