Vectorizing a Vector-Valued Function
I have the following function:
def h_Y1(X, theta):
    # X is a single row: [EF, FF, W..., j, k]
    EF = X[0]
    FF = X[1]
    j = X[-2]
    k = X[-1]
    W = X[2:-2]
    # theta is [Sigma, sigma_xi2, gamma_alpha, gamma_z...]
    Sigma = theta[0]
    sigma_xi2 = theta[1]
    gamma_alpha = theta[2]
    gamma_z = np.array(theta[3:])
    gW = gamma_z @ W
    eps1 = EF - gamma_alpha * gW
    if j == k:
        eps2 = FF - (gamma_alpha**2)*gW - gW*sigma_xi2 - Sigma
        eps3 = 0
    else:
        eps2 = 0
        eps3 = FF - (gamma_alpha**2)*gW
    h1 = [eps1 * Wk for Wk in W]
    h2 = [eps2 * Wk for Wk in W]
    h3 = [eps3 * Wk for Wk in W]
    return np.concatenate([h1, h2, h3])
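I need to evaluate this function for a set of values stored in gmmarray. Specifically, for a fixed theta, I want the function to be run on every row of gmmarray, with each row passed as the argument X.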
I am currently doing this with the following code:
import numpy as np
theta = [0.01, 1, 1, 0, 0]
gmmarray = np.random.random((1120451, 6))
# Apply h_Y1 to each row (axis 1) of gmmarray
test = np.apply_along_axis(h_Y1, 1, gmmarray, theta=theta)
However, this is slow: it takes about 19 seconds. I tried vectorizing the function as follows:
Vh_Y1 = np.vectorize(h_Y1, signature='(n),(j)->(i)')
test1 = Vh_Y1(gmmarray, theta)
However, this still takes 16 seconds. Am I doing something wrong, or is there a way to speed this up further?
Thank you very much!
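Both np.apply_along_axis and np.vectorize still call the Python function once per row, so neither removes the per-row overhead. Instead, you can pass the full gmmarray as the X argument. Then, rather than iterating over each row of gmmarray, you can use vectorized operations on its columns.
Like this: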
def h_Y1_vectorized(X, theta):
    EF, FF, W, j, k = np.hsplit(X, [1,2,4,5])  # Column vectors (except W)
    Sigma, sigma_xi2, gamma_alpha, *gamma_z = theta
    gW = (W @ gamma_z)[:, None]  # Ensure column vector
    ga_gW = gamma_alpha * gW
    FF_ga2gW = FF - gamma_alpha * ga_gW
    eps1 = EF - ga_gW
    j_equal_k = j == k
    eps2 = np.where(j_equal_k, FF_ga2gW - gW * sigma_xi2 - Sigma, 0)
    eps3 = np.where(j_equal_k, 0, FF_ga2gW)
    h1 = eps1 * W
    h2 = eps2 * W
    h3 = eps3 * W
    return np.hstack([h1, h2, h3])
Calling
>>> h_Y1_vectorized(gmmarray, theta)
produces the same result and is about 100 times faster.
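The equivalence of the two versions can be checked on a small sample before switching over. The following is a minimal sketch, not part of the original answer: it restates both functions compactly so that it runs on its own, and the names sample, theta_demo, rowwise, vectorized, both_branches and results_match are hypothetical. It also assumes that j and k are small integer indices, so that the j == k branch is actually exercised, and it uses nonzero gamma_z values chosen only for illustration.

```python
import numpy as np

def h_Y1(X, theta):
    # Row-wise version from the question, restated compactly.
    EF, FF, j, k = X[0], X[1], X[-2], X[-1]
    W = X[2:-2]
    Sigma, sigma_xi2, gamma_alpha = theta[0], theta[1], theta[2]
    gamma_z = np.array(theta[3:])
    gW = gamma_z @ W
    eps1 = EF - gamma_alpha * gW
    if j == k:
        eps2 = FF - (gamma_alpha**2) * gW - gW * sigma_xi2 - Sigma
        eps3 = 0.0
    else:
        eps2 = 0.0
        eps3 = FF - (gamma_alpha**2) * gW
    return np.concatenate([eps1 * W, eps2 * W, eps3 * W])

def h_Y1_vectorized(X, theta):
    # Column-wise version from the answer (expects exactly 6 columns).
    EF, FF, W, j, k = np.hsplit(X, [1, 2, 4, 5])
    Sigma, sigma_xi2, gamma_alpha, *gamma_z = theta
    gW = (W @ gamma_z)[:, None]
    ga_gW = gamma_alpha * gW
    FF_ga2gW = FF - gamma_alpha * ga_gW
    eps1 = EF - ga_gW
    j_equal_k = j == k
    eps2 = np.where(j_equal_k, FF_ga2gW - gW * sigma_xi2 - Sigma, 0)
    eps3 = np.where(j_equal_k, 0, FF_ga2gW)
    return np.hstack([eps1 * W, eps2 * W, eps3 * W])

# Assumption: illustrative data only, much smaller than the real gmmarray.
rng = np.random.default_rng(0)
n = 1000
sample = np.column_stack([
    rng.random((n, 4)),               # EF, FF and the two W columns
    rng.integers(0, 3, size=(n, 2)),  # j and k as small integer indices
])
theta_demo = [0.01, 1.0, 1.0, 0.5, -0.3]  # hypothetical parameter values

rowwise = np.apply_along_axis(h_Y1, 1, sample, theta=theta_demo)
vectorized = h_Y1_vectorized(sample, theta_demo)

# Make sure both the j == k and j != k cases occur in the sample.
both_branches = 0 < np.sum(sample[:, -2] == sample[:, -1]) < n
results_match = np.allclose(rowwise, vectorized)
print(rowwise.shape, vectorized.shape, both_branches, results_match)
```

Note that the split indices [1, 2, 4, 5] hardcode two W columns, so a wider W would need different indices (for example, derived from X.shape[1]).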