Normalizing data to a certain range of values

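I'm new to Python. Is there a function I can use to normalize data?

For example, I have a list of values in the range 0 - 1, e.g. [0.92323, 0.7232322, 0.93832, 0.4344433]

I want to normalize all of these values to the range 0.25 - 0.50

Thanks,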

You can do it like this:

>>> l = [0.92323, 0.7232322, 0.93832, 0.4344433]
>>> lower, upper = 0.25, 0.5
>>> l_norm = [lower + (upper - lower) * x for x in l]
>>> l_norm
[0.4808075, 0.43080805, 0.48458, 0.35861082499999997]

You can use sklearn.preprocessing for many kinds of preprocessing tasks, including normalization.
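To make the sklearn suggestion a bit more concrete: as far as I know, the relevant class there is MinMaxScaler with its feature_range parameter, and it rescales using the minimum and maximum found in the data itself rather than a fixed 0 - 1 input range. The sketch below mimics that behaviour in plain Python so the difference is visible. Note that min_max_scale is a hypothetical helper name (not a sklearn function), and mapping a constant list to the lower bound is my own assumption.

```python
def min_max_scale(values, lower, upper):
    """Rescale values so that min(values) -> lower and max(values) -> upper.

    Hypothetical helper for illustration; not part of sklearn.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: the data span is zero, so (by assumption)
        # map every value to the lower bound instead of dividing by zero.
        return [lower for _ in values]
    scale = (upper - lower) / (hi - lo)
    return [lower + (x - lo) * scale for x in values]


scaled = min_max_scale([0.92323, 0.7232322, 0.93832, 0.4344433], 0.25, 0.5)
```

With the list from the question, the smallest value (0.4344433) lands on 0.25 and the largest (0.93832) lands on 0.5, up to float rounding. That is not the same result as the list comprehension above, which assumes the input range is exactly 0 - 1.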

The following function handles the general case:

def normalize(values, bounds):
    # bounds holds the 'actual' (source) and 'desired' (target) ranges
    actual, desired = bounds['actual'], bounds['desired']
    return [
        desired['lower']
        + (x - actual['lower']) * (desired['upper'] - desired['lower'])
        / (actual['upper'] - actual['lower'])
        for x in values
    ]

Usage:

normalize(
    [0.92323, 0.7232322, 0.93832, 0.4344433],
    {'actual': {'lower': 0, 'upper': 1}, 'desired': {'lower': 0.25, 'upper': 0.5}}
) # [0.4808075, 0.43080805, 0.48458, 0.35861082499999997]

normalize(
    [5, 7.5, 10, 12.5, 15],
    {'actual':{'lower':5,'upper':15},'desired':{'lower':1,'upper':2}}
) # [1.0, 1.25, 1.5, 1.75, 2.0]

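I chose a two-level dict as the argument, but you can pass the bounds in several other ways, for example as two separate tuples, one for the actual bounds and one for the desired bounds, with the lower bound as the first element and the upper bound as the second: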

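def normalize(values, actual_bounds, desired_bounds):
    # Each bounds tuple is (lower, upper)
    actual_lower, actual_upper = actual_bounds
    desired_lower, desired_upper = desired_bounds
    return [
        desired_lower
        + (x - actual_lower) * (desired_upper - desired_lower)
        / (actual_upper - actual_lower)
        for x in values
    ]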

Usage:

normalize(
    [0.92323, 0.7232322, 0.93832, 0.4344433],
    (0,1),
    (0.25,0.5)
) # [0.4808075, 0.43080805, 0.48458, 0.35861082499999997]

normalize(
    [5, 7.5, 10, 12.5, 15],
    (5,15),
    (1,2)
) # [1.0, 1.25, 1.5, 1.75, 2.0]
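One caveat with the general formula: if the actual lower and upper bounds are equal, the division raises ZeroDivisionError. Below is a minimal sketch of the tuple version with an explicit check. The name normalize_checked and the choice to raise ValueError are my own assumptions, not something the functions above do.

```python
def normalize_checked(values, actual_bounds, desired_bounds):
    """Linearly map values from actual_bounds to desired_bounds.

    Hypothetical variant for illustration; each bounds tuple is (lower, upper).
    """
    a_lo, a_hi = actual_bounds
    d_lo, d_hi = desired_bounds
    if a_hi == a_lo:
        # A zero-width source range would otherwise raise ZeroDivisionError
        raise ValueError("actual_bounds must have distinct lower and upper values")
    return [d_lo + (x - a_lo) * (d_hi - d_lo) / (a_hi - a_lo) for x in values]


result = normalize_checked([5, 7.5, 10, 12.5, 15], (5, 15), (1, 2))
# result == [1.0, 1.25, 1.5, 1.75, 2.0], same as the example above
```

Values outside the actual bounds are still mapped linearly, so they end up outside the desired range; clamp them first if that is not what you want.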