NumPy vectorize and atomic vectors

Question

I would like to implement a function that works on arrays the way NumPy's element-wise functions do, for example np.add([2,3], 1) = [3,4] and np.add([1,2], [3,4]) = [4,6].

However, even a trivial test implementation already behaves awkwardly:

import numpy as np

def triv(a, b): return a, b

# np.int was only an alias for the built-in int and has been removed from recent NumPy releases
triv_vec = np.vectorize(triv, otypes = [int])
triv_vec([1,2],[3,4])

The result is:

array([0, 0])

instead of the desired result:

array([[1,3], [2,4]])

Any idea what is going on here? Thanks.

Answer 1

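You need otypes=[int, int], one entry per value the function returns (np.int in older code was simply an alias for the built-in int):

triv_vec = np.vectorize(triv, otypes=[int, int])
print(triv_vec([1,2],[3,4]))
(array([1, 2]), array([3, 4]))

otypes : str or list of dtypes, optional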

The output data type. It must be specified as either a string of typecode characters or a list of data type specifiers. There should be one data type specifier for each output.
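The call above returns a tuple of two arrays rather than the array([[1,3], [2,4]]) layout the question asks for. A minimal sketch of one way to get that layout, assuming it is acceptable to pair the two outputs afterwards with np.column_stack (the names first, second and pairs are only illustrative):

```python
import numpy as np

def triv(a, b):
    # Scalar function that returns two values for each pair of inputs.
    return a, b

# One otypes entry per output, so vectorize returns a tuple of two arrays.
triv_vec = np.vectorize(triv, otypes=[int, int])
first, second = triv_vec([1, 2], [3, 4])

# Pair the outputs row by row to get the layout from the question.
pairs = np.column_stack((first, second))
print(pairs)
# [[1 3]
#  [2 4]]
```

For this particular function the vectorize step is not strictly needed, since np.column_stack(([1, 2], [3, 4])) gives the same array directly.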

Answer 2

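My original question was really about whether vectorize performs internal type conversion and runs an optimized internal loop, and how much that affects performance. So here is the answer:

It does, but with a gain of less than 23%, the effect is not as good as I had expected.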

import numpy as np

def make_tuple(a, b): return tuple([a, b])

make_tuple_vec = np.vectorize(make_tuple, otypes = [int, int])

# random_integers is deprecated; randint excludes the upper bound, so high = 6 still samples from -5..5
v1 = np.random.randint(-5, high = 6, size = 100000)
v2 = np.random.randint(-5, high = 6, size = 100000)

%timeit [tuple([i,j]) for i,j in zip(v1,v2)] # ~ 596 µs per loop

%timeit make_tuple_vec(v1, v2) # ~ 544 µs per loop

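Furthermore, the tuple-generating function does not vectorize as expected. The built-in map, as in map(make_tuple, v1, v2), is the clear loser of this comparison, with an execution time about 100 times slower (in Python 3, map is lazy, so it has to be wrapped in list() to do the same work):

%timeit list(map(make_tuple, v1, v2)) # ~ 64.4 ms per loop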
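The %timeit lines above only run inside IPython. The NumPy documentation describes np.vectorize as a convenience wrapper whose implementation is essentially a for loop, which fits the modest gain reported here. Below is a rough sketch of the same comparison using the standard timeit module, with a plain np.column_stack call added as a pure-NumPy baseline. The seed, the repeat counts, and the names candidates and timings are arbitrary choices for illustration, and the measured numbers will depend on the machine and NumPy version.

```python
import timeit
import numpy as np

def make_tuple(a, b):
    return (a, b)

make_tuple_vec = np.vectorize(make_tuple, otypes=[int, int])

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
v1 = rng.integers(-5, 6, size=100_000)  # upper bound is exclusive
v2 = rng.integers(-5, 6, size=100_000)

# Three ways of pairing up the elements of v1 and v2.
candidates = {
    "list comprehension": lambda: [(i, j) for i, j in zip(v1, v2)],
    "np.vectorize": lambda: make_tuple_vec(v1, v2),
    "np.column_stack": lambda: np.column_stack((v1, v2)),
}

timings = {}
for name, fn in candidates.items():
    # Best of 3 repeats, each repeat averaging over 3 calls.
    timings[name] = min(timeit.repeat(fn, number=3, repeat=3)) / 3
    print(f"{name}: {timings[name] * 1e3:.2f} ms per call")
```

Note that the three candidates do not return the same type (a list of tuples, a tuple of two arrays, and a 2-D array), so the comparison is only indicative.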
