Normalizing data to a certain range of values
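I am new to Python. Is there a function I can use to normalize data?
For example, I have a list of values in the range 0 - 1, e.g. [0.92323, 0.7232322, 0.93832, 0.4344433]
I want to normalize all of these values to the range 0.25 - 0.50
Thanks,
You can do it like this: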
>>> l = [0.92323, 0.7232322, 0.93832, 0.4344433]
>>> lower, upper = 0.25, 0.5
>>> l_norm = [lower + (upper - lower) * x for x in l]
>>> l_norm
[0.4808075, 0.43080805, 0.48458, 0.35861082499999997]
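Note that the list comprehension above only works because the inputs already lie in the range 0 - 1. As a quick sanity check, here is a minimal sketch of the same formula wrapped in a function (the name `rescale_unit` is made up for illustration), showing that the endpoints 0 and 1 land exactly on the new bounds:

```python
def rescale_unit(x, lower, upper):
    # Linear map from [0, 1] onto [lower, upper]; assumes 0 <= x <= 1.
    return lower + (upper - lower) * x

lower, upper = 0.25, 0.5
print(rescale_unit(0.0, lower, upper))  # 0.25
print(rescale_unit(1.0, lower, upper))  # 0.5
print(rescale_unit(0.5, lower, upper))  # 0.375
```

If the inputs can fall outside 0 - 1, the outputs will fall outside the target range as well.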
You can use sklearn.preprocessing for many kinds of preprocessing tasks, including normalization.
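For context, scikit-learn's MinMaxScaler accepts a feature_range argument and takes the input minimum and maximum from the data itself rather than assuming 0 - 1. The dependency-free sketch below illustrates that idea in plain Python. `min_max_scale` is a hypothetical helper, not part of scikit-learn, and mapping constant input to the lower bound is an assumption made here for simplicity:

```python
def min_max_scale(values, feature_range=(0.0, 1.0)):
    # Hypothetical helper: rescale so min(values) -> lo and max(values) -> hi.
    lo, hi = feature_range
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    if span == 0:
        # Assumption: constant input is mapped to the lower bound.
        return [lo for _ in values]
    return [lo + (x - v_min) * (hi - lo) / span for x in values]

scaled = min_max_scale([5, 7.5, 10, 12.5, 15], feature_range=(1, 2))
print(scaled)  # [1.0, 1.25, 1.5, 1.75, 2.0]
```

Because the bounds come from the data, the smallest value in the question's list (0.4344433) would map to 0.25 here, which differs from the first answer, where 0 maps to 0.25.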
The following function handles the general case:
def normalize(values, bounds):
    return [bounds['desired']['lower'] + (x - bounds['actual']['lower']) * (bounds['desired']['upper'] - bounds['desired']['lower']) / (bounds['actual']['upper'] - bounds['actual']['lower']) for x in values]
Usage:
normalize(
[0.92323, 0.7232322, 0.93832, 0.4344433],
{'actual': {'lower': 0, 'upper': 1}, 'desired': {'lower': 0.25, 'upper': 0.5}}
) # [0.4808075, 0.43080805, 0.48458, 0.35861082499999997]
normalize(
[5, 7.5, 10, 12.5, 15],
{'actual':{'lower':5,'upper':15},'desired':{'lower':1,'upper':2}}
) # [1.0, 1.25, 1.5, 1.75, 2.0]
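I chose a two-level dictionary as the argument, but you could pass the bounds in other ways, for example as two separate tuples, one for the actual bounds and one for the desired bounds, each with the lower bound first and the upper bound second:
def normalize(values, actual_bounds, desired_bounds):
    return [desired_bounds[0] + (x - actual_bounds[0]) * (desired_bounds[1] - desired_bounds[0]) / (actual_bounds[1] - actual_bounds[0]) for x in values]
Usage: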
normalize(
[0.92323, 0.7232322, 0.93832, 0.4344433],
(0,1),
(0.25,0.5)
) # [0.4808075, 0.43080805, 0.48458, 0.35861082499999997]
normalize(
[5, 7.5, 10, 12.5, 15],
(5,15),
(1,2)
) # [1.0, 1.25, 1.5, 1.75, 2.0]
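One thing to watch for: both versions divide by the width of the actual range, so they raise ZeroDivisionError when the actual lower and upper bounds are equal. A minimal sketch of a guarded variant follows. The name `normalize_checked` and the choice to raise ValueError are assumptions for illustration, not part of the answer above:

```python
def normalize_checked(values, actual_bounds, desired_bounds):
    # Unpack (lower, upper) pairs for readability.
    a_lo, a_hi = actual_bounds
    d_lo, d_hi = desired_bounds
    if a_hi == a_lo:
        # Assumption: fail early with a descriptive error instead of
        # letting the division below raise ZeroDivisionError.
        raise ValueError("actual_bounds must span a non-zero range")
    return [d_lo + (x - a_lo) * (d_hi - d_lo) / (a_hi - a_lo) for x in values]

print(normalize_checked([5, 7.5, 10, 12.5, 15], (5, 15), (1, 2)))
# [1.0, 1.25, 1.5, 1.75, 2.0]
```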