How to normalize variables to the range 0 to 1 in a pandas DataFrame

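I want to normalize the following dataset within each group, using the formula

(x - min(x)) / (max(x) - min(x))

where min(x) and max(x) are taken per group. How can I do this in a pandas DataFrame? I need to normalize both price and size. Thanks.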

import pandas as pd

data = [['Group 1', 10, 100],
        ['Group 1', 20, 80],
        ['Group 1', 15, 60],
        ['Group 1', 10, 120],
        ['Group 2', 10, 120],
        ['Group 2', 20, 130],
        ['Group 2', 30, 200],
        ['Group 2', 40, 250],
        ['Group 2', 50, 300]]
df = pd.DataFrame(data, columns=['Group', 'price', 'size'])

Use GroupBy.apply with a custom function:

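cols = ['price', 'size']
# group_keys=False keeps the original row index, so the result aligns with df on assignment
df[cols] = df.groupby('Group', group_keys=False)[cols].apply(lambda x: (x - x.min()) / (x.max() - x.min()))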
print (df)
     Group  price      size
0  Group 1   0.00  0.666667
1  Group 1   1.00  0.333333
2  Group 1   0.50  0.000000
3  Group 1   0.00  1.000000
4  Group 2   0.00  0.000000
5  Group 2   0.25  0.055556
6  Group 2   0.50  0.444444
7  Group 2   0.75  0.722222
8  Group 2   1.00  1.000000
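
One edge case the formula does not cover: if every value in a group is identical, max(x) - min(x) is 0 and the result is NaN (0/0). A minimal sketch of one way to handle it, assuming you want constant groups to map to 0; the helper name `min_max` and the `demo` data are hypothetical, not part of the question:

```python
import pandas as pd

def min_max(x):
    """Scale a Series to [0, 1]; a constant Series maps to all zeros."""
    rng = x.max() - x.min()
    # A constant group has rng == 0; divide by 1 instead to avoid 0/0 -> NaN.
    return (x - x.min()) / (rng if rng != 0 else 1)

# Hypothetical data: group 'A' has a constant price.
demo = pd.DataFrame({'Group': ['A', 'A', 'B', 'B'],
                     'price': [5, 5, 1, 3]})
demo['price_norm'] = demo.groupby('Group')['price'].transform(min_max)
print(demo)
```

Whether 0, 0.5, or NaN is the right value for a constant group depends on how the normalized column is used downstream.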

Alternatively, use GroupBy.transform and join the normalized values as new columns:

cols = ['price','size']

g = df.groupby('Group')[cols]
min1 = g.transform('min')
max1 = g.transform('max')
df1 = df.join(df[cols].sub(min1).div(max1 - min1).add_suffix('_norm'))
print (df1)
     Group  price  size  price_norm  size_norm
0  Group 1     10   100        0.00   0.666667
1  Group 1     20    80        1.00   0.333333
2  Group 1     15    60        0.50   0.000000
3  Group 1     10   120        0.00   1.000000
4  Group 2     10   120        0.00   0.000000
5  Group 2     20   130        0.25   0.055556
6  Group 2     30   200        0.50   0.444444
7  Group 2     40   250        0.75   0.722222
8  Group 2     50   300        1.00   1.000000
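
If this is needed in more than one place, the same transform-based steps could be wrapped in a small function. This is only a sketch; the function name `normalize_by_group` and its `suffix` parameter are made up for illustration:

```python
import pandas as pd

def normalize_by_group(df, group_col, cols, suffix='_norm'):
    """Return a copy of df with per-group min-max scaled columns appended."""
    g = df.groupby(group_col)[cols]
    lo = g.transform('min')   # per-row minimum of the row's group
    hi = g.transform('max')   # per-row maximum of the row's group
    scaled = (df[cols] - lo) / (hi - lo)
    # join returns a new DataFrame, so the input is left unchanged
    return df.join(scaled.add_suffix(suffix))

data = [['Group 1', 10, 100], ['Group 1', 20, 80], ['Group 1', 15, 60],
        ['Group 1', 10, 120], ['Group 2', 10, 120], ['Group 2', 20, 130],
        ['Group 2', 30, 200], ['Group 2', 40, 250], ['Group 2', 50, 300]]
df = pd.DataFrame(data, columns=['Group', 'price', 'size'])

out = normalize_by_group(df, 'Group', ['price', 'size'])
print(out)
```

Using the built-in 'min' and 'max' aggregations avoids calling a Python lambda once per group, which tends to matter on larger frames.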

Use groupby and transform:

df[['normalized_price', 'normalized_size']] = df.groupby('Group')[['price', 'size']].transform(lambda x: (x - x.min()) / (x.max() - x.min()))
df
     Group  price  size  normalized_price  normalized_size
0  Group 1     10   100              0.00         0.666667
1  Group 1     20    80              1.00         0.333333
2  Group 1     15    60              0.50         0.000000
3  Group 1     10   120              0.00         1.000000
4  Group 2     10   120              0.00         0.000000
5  Group 2     20   130              0.25         0.055556
6  Group 2     30   200              0.50         0.444444
7  Group 2     40   250              0.75         0.722222
8  Group 2     50   300              1.00         1.000000
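
A quick way to confirm that the normalization really was applied per group is to check that every group's normalized minimum is 0 and maximum is 1. A rough sketch on the question's data; the variable names `norm` and `check` are just illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    'Group': ['Group 1'] * 4 + ['Group 2'] * 5,
    'price': [10, 20, 15, 10, 10, 20, 30, 40, 50],
    'size':  [100, 80, 60, 120, 120, 130, 200, 250, 300],
})
cols = ['price', 'size']

norm = df.groupby('Group')[cols].transform(
    lambda x: (x - x.min()) / (x.max() - x.min()))

# Per-group min and max of the normalized columns; expect 0 and 1 everywhere
check = norm.groupby(df['Group']).agg(['min', 'max'])
print(check)
```

If the scaling had been done over the whole column instead of per group, at least one group would show a minimum above 0 or a maximum below 1.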