How to vectorize the following Python code?
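I'm trying to use NumPy and vectorized operations to make a piece of code run faster, but I haven't managed to find a solution. If anyone has an idea... thanks.
Here is the working code with loops: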
y = np.zeros(len(tab))
for i in range(len(tab)):
    s = 0
    for n in range(len(coef[0])):
        s += coef[0][n] * ((a + b * np.dot(tab[i], vectors[n])) ** d)
    y[i] = s
where,
- tab: numpy.array of shape (N, M)
- vectors: numpy.array of shape (P, M)
- coef: numpy.array of shape (1, P)
- a, b, d: constants (a = 0, if that makes it simpler)
It looks ugly, but is this what you need?
y = np.array([sum(coef[0][n] * ((a + b * np.dot(tab[i], vectors[n])) ** d)
                  for n in range(len(vectors))) for i in range(len(tab))])
You can use an approach based on np.einsum and matrix multiplication with np.dot, as listed below:
# Calculate "((a + b * np.dot(tab[i], vectors[n])) ** d)" part
p1 = (a + b*np.einsum('ij,kj->ki',tab,vectors))**d
# Include "+= coef[0][n] *" part to get the final output
y_vectorized = np.dot(coef,p1)
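To make the two steps above easier to check, here is a minimal sketch that wraps the loop and the einsum version into functions and compares their outputs. The function names and the signature (tab, vectors, coef, a, b, c, d) are taken from the timing calls in this thread; the function bodies, the small test sizes, and the final .ravel() (added so the result has shape (N,) like the loop's y, instead of (1, N)) are assumptions.

```python
import numpy as np

def original_approach(tab, vectors, coef, a, b, c, d):
    # Double loop from the question. c is accepted only to match the
    # timing calls in this thread; it is not used in the computation.
    y = np.zeros(len(tab))
    for i in range(len(tab)):
        s = 0.0
        for n in range(len(coef[0])):
            s += coef[0][n] * ((a + b * np.dot(tab[i], vectors[n])) ** d)
        y[i] = s
    return y

def proposed_approach(tab, vectors, coef, a, b, c, d):
    # 'ij,kj->ki' produces a (P, N) array of all pairwise dot products,
    # i.e. the same values as vectors @ tab.T.
    p1 = (a + b * np.einsum('ij,kj->ki', tab, vectors)) ** d
    # (1, P) dot (P, N) -> (1, N); ravel to get shape (N,) like the loop.
    return np.dot(coef, p1).ravel()

# Small random inputs (sizes chosen arbitrarily for the check).
rng = np.random.default_rng(0)
N, M, P = 6, 4, 5
tab = rng.random((N, M))
vectors = rng.random((P, M))
coef = rng.random((1, P))
a, b, c, d = 3.233, 0.4343, 2.0483, 3

y_loop = original_approach(tab, vectors, coef, a, b, c, d)
y_vec = proposed_approach(tab, vectors, coef, a, b, c, d)
print(np.allclose(y_loop, y_vec))
```

Comparing with np.allclose rather than == allows for small floating-point differences, since the two versions sum the terms in a different order.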
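Runtime tests
Dataset #1:
Here is a quick runtime test comparing the original loop-based approach against the proposed approach for some random values: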
In [168]: N = 50
...: M = 50
...: P = 50
...:
...: tab = np.random.rand(N,M)
...: vectors = np.random.rand(P,M)
...: coef = np.random.rand(1,P)
...:
...: a = 3.233
...: b = 0.4343
...: c = 2.0483
...: d = 3
...:
In [169]: %timeit original_approach(tab,vectors,coef,a,b,c,d)
100 loops, best of 3: 4.18 ms per loop
In [170]: %timeit proposed_approach(tab,vectors,coef,a,b,c,d)
10000 loops, best of 3: 136 µs per loop
Dataset #2:
With N, M and P set to 150 each, the runtimes were:
In [196]: %timeit original_approach(tab,vectors,coef,a,b,c,d)
10 loops, best of 3: 37.9 ms per loop
In [197]: %timeit proposed_approach(tab,vectors,coef,a,b,c,d)
1000 loops, best of 3: 1.91 ms per loop
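For anyone who wants to repeat this comparison outside IPython, the following is a rough sketch using the standard timeit module instead of %timeit. The function bodies, the best_time helper, and the repeat/number settings are assumptions made for illustration; absolute timings will differ by machine and NumPy version, so no expected numbers are shown.

```python
import timeit
import numpy as np

def original_approach(tab, vectors, coef, a, b, c, d):
    # Loop version from the question (c is unused, kept to match the calls above).
    y = np.zeros(len(tab))
    for i in range(len(tab)):
        s = 0.0
        for n in range(len(coef[0])):
            s += coef[0][n] * ((a + b * np.dot(tab[i], vectors[n])) ** d)
        y[i] = s
    return y

def proposed_approach(tab, vectors, coef, a, b, c, d):
    # einsum + dot version from the answer.
    p1 = (a + b * np.einsum('ij,kj->ki', tab, vectors)) ** d
    return np.dot(coef, p1)

def best_time(func, args, repeat=3, number=5):
    # Hypothetical helper: best per-call time in seconds over `repeat`
    # runs of `number` calls each, similar in spirit to "best of 3".
    runs = timeit.repeat(lambda: func(*args), repeat=repeat, number=number)
    return min(runs) / number

rng = np.random.default_rng(0)
a, b, c, d = 3.233, 0.4343, 2.0483, 3
timings = {}
for size in (50, 150):
    # N = M = P = size, as in the two datasets above.
    tab = rng.random((size, size))
    vectors = rng.random((size, size))
    coef = rng.random((1, size))
    args = (tab, vectors, coef, a, b, c, d)
    timings[size] = (best_time(original_approach, args),
                     best_time(proposed_approach, args))
    print(f"N=M=P={size}: loop {timings[size][0]:.6f} s, "
          f"vectorized {timings[size][1]:.6f} s")
```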