Why is numpy.vectorize() changing the division output of a scalar function?
I am getting a strange result when I vectorize a function with numpy.
import numpy as np

def scalar_function(x, y):
    """ A function that returns x*y if x<y and x/y otherwise
    """
    if x < y:
        out = x * y
    else:
        out = x / y
    return out

def vector_function(x, y):
    """
    Make it possible to accept vectors as input
    """
    v_scalar_function = np.vectorize(scalar_function)
    return v_scalar_function(x, y)
We have
scalar_function(4,3)
# 1.3333333333333333
Why does the vectorized version give this strange output?
vector_function(np.array([3,4]), np.array([4,3]))
[12 1]
While this call to the vectorized version works fine:
vector_function(np.array([4,4]), np.array([4,3]))
[1. 1.33333333]
Reading the numpy.divide documentation:
Notes
The floor division operator // was added in Python 2.2 making // and / equivalent operators. The default floor division operation of / can be replaced by true division with from __future__ import division.
In Python 3.0, // is the floor division operator and / the true division operator. The true_divide(x1, x2) function is equivalent to true division in Python.
makes me think this might be a legacy issue related to Python 2? But I am using Python 3!
Checking which statements are triggered:
import numpy as np

def scalar_function(x, y):
    """ A function that returns x*y if x<y and x/y otherwise
    """
    if x < y:
        print('if x: ', x)
        print('if y: ', y)
        out = x * y
        print('if out', out)
    else:
        print('else x: ', x)
        print('else y: ', y)
        out = x / y
        print('else out', out)
    return out

def vector_function(x, y):
    """
    Make it possible to accept vectors as input
    """
    v_scalar_function = np.vectorize(scalar_function)
    return v_scalar_function(x, y)

vector_function(np.array([3,4]), np.array([4,3]))
if x: 3
if y: 4
if out 12
if x: 3
if y: 4
if out 12
else x: 4
else y: 3
else out 1.3333333333333333 # <-- seems that the value is calculated correctly, but the wrong dtype is returned
So you can rewrite the scalar function so that it always returns a float:
def scalar_function(x, y):
    """ A function that returns x*y if x<y and x/y otherwise
    """
    if x < y:
        out = x * y
    else:
        out = x / y
    return float(out)

vector_function(np.array([3,4]), np.array([4,3]))
array([12. , 1.33333333])
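The documentation for numpy.vectorize states:
The output type is determined by evaluating the first element of the input, unless it is specified
Since you did not specify a return data type, and the first element evaluated is an integer multiplication, the output array is given an integer type and the later values are cast to integers (so 1.333... becomes 1). Conversely, when the first operation is a division, the data type is automatically upcast to float. You can fix your code by specifying a dtype in vector_function (for this problem it does not necessarily have to be as wide as 64 bits):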
def vector_function(x, y):
    """
    Make it possible to accept vectors as input
    """
    v_scalar_function = np.vectorize(scalar_function, otypes=[np.float64])
    return v_scalar_function(x, y)
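To make the dtype inference concrete, here is a minimal sketch that compares the inferred output type for the two input orderings from the question against the version with otypes set. The names inferred, explicit, int_first, float_first and fixed are only illustrative and do not come from the original code.

```python
import numpy as np

def scalar_function(x, y):
    """Return x*y if x < y, and x/y otherwise."""
    if x < y:
        return x * y
    return x / y

# Without otypes, the output dtype is taken from the first evaluated element.
inferred = np.vectorize(scalar_function)
# With otypes, the output dtype is fixed up front.
explicit = np.vectorize(scalar_function, otypes=[np.float64])

# First pair is (3, 4): the multiplication branch returns an integer,
# so the whole output array gets an integer dtype and 4/3 is cast to 1.
int_first = inferred(np.array([3, 4]), np.array([4, 3]))

# First pair is (4, 4): the division branch returns a float,
# so the output array gets a float dtype.
float_first = inferred(np.array([4, 4]), np.array([4, 3]))

# With otypes the result no longer depends on which branch runs first.
fixed = explicit(np.array([3, 4]), np.array([4, 3]))

print(int_first, int_first.dtype.kind)      # kind 'i' means integer
print(float_first, float_first.dtype.kind)  # kind 'f' means float
print(fixed, fixed.dtype)
```

The exact integer width in the first case can vary by platform and NumPy version, which is why the sketch inspects dtype.kind rather than a specific integer dtype.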
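Also, you should note from the same documentation that numpy.vectorize is a convenience function that essentially just wraps a Python for loop, so it does not provide the performance gain that true vectorization would.
For a binary choice like this, a better overall approach is: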
def vectorized_scalar_function(arr_1, arr_2):
    return np.where(arr_1 < arr_2, arr_1 * arr_2, arr_1 / arr_2)

print(vectorized_scalar_function(np.array([4, 4]), np.array([4, 3])))
print(vectorized_scalar_function(np.array([3, 4]), np.array([4, 3])))
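The above should be orders of magnitude faster, and (possibly by coincidence rather than a hard rule to rely on) the result does not suffer from the type-casting problem.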
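As a rough check of both claims, the sketch below compares the np.where version with np.vectorize on random integer inputs. The array size, the seed, the repeat count and the names where_version and looped are arbitrary choices for illustration, and the timings will differ from machine to machine. The float result appears to follow from NumPy's ordinary type promotion between the integer product and the float quotient, which np.result_type reports directly.

```python
import timeit
import numpy as np

def scalar_function(x, y):
    """Return x*y if x < y, and x/y otherwise."""
    if x < y:
        return x * y
    return x / y

def where_version(arr_1, arr_2):
    # Both branches are computed for every element, then selected by the mask.
    return np.where(arr_1 < arr_2, arr_1 * arr_2, arr_1 / arr_2)

looped = np.vectorize(scalar_function, otypes=[np.float64])

rng = np.random.default_rng(0)
# Values start at 1 so the division branch never divides by zero.
a = rng.integers(1, 100, size=10_000)
b = rng.integers(1, 100, size=10_000)

# int * int is an integer array, int / int is a float array;
# np.where combines them using the promoted dtype.
print(np.result_type(a * b, a / b))
print(where_version(a, b).dtype)

# Both approaches should agree on the values.
same = np.allclose(where_version(a, b), looped(a, b))
print("same values:", same)

t_where = timeit.timeit(lambda: where_version(a, b), number=20)
t_loop = timeit.timeit(lambda: looped(a, b), number=20)
print(f"np.where: {t_where:.4f}s  np.vectorize: {t_loop:.4f}s")
```

One design point to keep in mind: because np.where evaluates both arr_1 * arr_2 and arr_1 / arr_2 over the full arrays before selecting, a zero in arr_2 would still trigger a division warning even at positions where the product is the value that ends up being used.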