sklearn normalize() produces every value as 1

Question

I'm trying to scale a single feature to [0, 1], but every value in the result is the float 1, which is clearly wrong.

import pandas as pd
import numpy as np
from sklearn.preprocessing import normalize

test = pd.DataFrame(data=[7, 6, 5, 2, 9, 9, 7, 8, 6, 5], columns=['data'])
normalize(test['data'].values.reshape(-1, 1))

This produces the following output:

array([[1.],
       [1.],
       [1.],
       [1.],
       [1.],
       [1.],
       [1.],
       [1.],
       [1.],
       [1.]])

I thought this might be an int-versus-float dtype issue, so I tried casting to float first, normalize(test['data'].astype(float).values.reshape(-1, 1)), but that gives the same result. What am I missing?

Answer 1

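This happens because axis defaults to 1, so normalize() rescales each row (sample) to unit norm. After reshape(-1, 1) every row holds a single value, and a nonzero value divided by its own L2 norm is always 1.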

Set axis=0 to normalize along the column (the feature) instead:

normalize(test['data'].values.reshape(-1, 1), axis=0)

Output (the column now has unit L2 norm, which is not the same as being scaled to [0, 1]):

array([[0.32998316],
       [0.28284271],
       [0.23570226],
       [0.0942809 ],
       [0.42426407],
       [0.42426407],
       [0.32998316],
       [0.37712362],
       [0.28284271],
       [0.23570226]])
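
To make the axis behaviour concrete, here is a minimal sketch that reproduces both results and checks the axis=0 output against a manual division by the column's L2 norm. The variable names (x, rows, cols, manual) are illustrative only, and the sketch relies on normalize() using its default norm='l2'.

```python
import numpy as np
from sklearn.preprocessing import normalize

# Same ten values as in the question, as a single-column 2D array.
x = np.array([7, 6, 5, 2, 9, 9, 7, 8, 6, 5], dtype=float).reshape(-1, 1)

# Default axis=1: each row is divided by its own L2 norm.
# A row with one nonzero value v becomes v / |v|, i.e. 1.
rows = normalize(x)

# axis=0: the whole column is divided by the column's L2 norm.
cols = normalize(x, axis=0)

# Manual equivalent of the axis=0 call.
manual = x / np.linalg.norm(x, axis=0)

print(rows.ravel())
print(cols.ravel())
print(np.allclose(cols, manual))
```

The axis=0 result has unit length as a vector, but its minimum and maximum are not 0 and 1, so it does not give the [0, 1] range asked for in the question.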

Answer 2

I think we can use min-max scaling, subtracting the minimum and dividing by the range:

(test.data-test.data.min())/np.ptp(test.data.values)
Out[136]: 
0    0.714286
1    0.571429
2    0.428571
3    0.000000
4    1.000000
5    1.000000
6    0.714286
7    0.857143
8    0.571429
9    0.428571
Name: data, dtype: float64
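
If you would rather stay inside scikit-learn, the same min-max scaling can probably be done with MinMaxScaler, whose default feature_range is (0, 1). This is a sketch of an alternative, not what the answer above used; the names scaler, scaled and expected are illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

test = pd.DataFrame(data=[7, 6, 5, 2, 9, 9, 7, 8, 6, 5], columns=['data'])

# MinMaxScaler expects 2D input, so select the column with double brackets.
scaler = MinMaxScaler()  # default feature_range=(0, 1)
scaled = scaler.fit_transform(test[['data']])

# Manual formula from the answer above, for comparison.
col = test['data']
expected = ((col - col.min()) / (col.max() - col.min())).to_numpy()

print(scaled.ravel())
print(np.allclose(scaled.ravel(), expected))
```

One reason to prefer the scaler object is that it stores the fitted minimum and range, so scaler.transform() can apply the same scaling to new data later.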
