Vectorizing a Vector-Valued Function

I have the following function:

def h_Y1(X, theta):
    EF = X[0]
    FF = X[1]
    j = X[-2]
    k = X[-1]
    W = X[2:-2]
    Sigma = theta[0]
    sigma_xi2 = theta[1]
    gamma_alpha = theta[2]
    gamma_z = np.array(theta[3:])
    gW = gamma_z @ W
    eps1 = EF - gamma_alpha * gW
    if j == k:
        eps2 = FF - (gamma_alpha**2)*gW - gW*sigma_xi2 - Sigma
        eps3 = 0
    else:
        eps2 = 0
        eps3 = FF - (gamma_alpha**2)*gW
    h1 = [eps1 * Wk for Wk in W]
    h2 = [eps2 * Wk for Wk in W]
    h3 = [eps3 * Wk for Wk in W]
    return np.concatenate([h1, h2, h3])

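I need to evaluate this function over a set of values stored in gmmarray. Specifically, for a fixed theta, I want the function to be run on each row of gmmarray, with that row passed as the argument X.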

I am currently doing this with the following code:

import numpy as np
theta = [0.01, 1, 1, 0, 0]
gmmarray = np.random.random((1120451, 6))
test = np.apply_along_axis(h_Y1, 1, gmmarray, theta=theta)

However, this is slow: it takes about 19 seconds. I tried vectorizing the function as follows:

Vh_Y1 = np.vectorize(h_Y1, signature = '(n),(j)->(i)')
test1 = Vh_Y1(gmmarray, theta)

However, this still takes 16 seconds. Am I doing something wrong, or is there a way to speed this up further?

Thanks a lot!

You can pass the whole gmmarray as the X argument. Then, instead of iterating over each row of gmmarray, you can use vectorized operations on its columns.

Like this:

def h_Y1_vectorized(X, theta):
    EF, FF, W, j, k = np.hsplit(X, [1,2,4,5])         # Column vectors (except W)
    Sigma, sigma_xi2, gamma_alpha, *gamma_z = theta

    gW = (W @ gamma_z)[:, None]                       # Ensure column vector
    ga_gW = gamma_alpha * gW
    FF_ga2gW = FF - gamma_alpha * ga_gW

    eps1 = EF - ga_gW
    j_equal_k = j == k
    eps2 = np.where(j_equal_k, FF_ga2gW - gW * sigma_xi2 - Sigma, 0)
    eps3 = np.where(j_equal_k, 0, FF_ga2gW)

    h1 = eps1 * W
    h2 = eps2 * W
    h3 = eps3 * W

    return np.hstack([h1, h2, h3])
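
A note on the `[:, None]` step in the function above: `W @ gamma_z` returns one scalar per row as a flat array of shape (N,), and it has to become a column of shape (N, 1) before it can scale every W column of the same row. The small sketch below shows this on a toy array; the values of `W` and `gamma_z` here are made up for illustration and are not taken from the question.

```python
import numpy as np

# Toy data (illustrative values only): three rows, two W columns
W = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
gamma_z = np.array([10.0, 1.0])

gW_flat = W @ gamma_z        # shape (3,): one scalar per row
gW_col = gW_flat[:, None]    # shape (3, 1): the same values as a column

# (3, 1) * (3, 2) broadcasts: row i of W is scaled by gW_col[i]
scaled = gW_col * W

# Without the reshape, (3,) * (3, 2) does not broadcast (3 vs 2 columns)
try:
    gW_flat * W
    flat_fails = False
except ValueError:
    flat_fails = True

print(scaled.shape, flat_fails)
```

If W happened to be square, the flat version would not raise but would silently scale columns instead of rows, so the explicit column reshape is the safer habit.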

Calling

>>> h_Y1_vectorized(gmmarray, theta)

yields the same result, roughly 100 times faster.
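
One way to check the equivalence yourself is to compare both versions on a small sample with `np.allclose`. The sketch below is only a sanity check under a few assumptions of mine: the sample size, the seed, and the nonzero `gamma_z` values in `theta` are arbitrary, and the last two columns are filled with small integers so that the `j == k` branch is actually exercised (with uniformly random floats, `j == k` would almost never be true, and with `gamma_z = [0, 0]` the `gW` terms vanish).

```python
import numpy as np

def h_Y1(X, theta):
    # Row-wise reference version, condensed from the question
    EF, FF, j, k = X[0], X[1], X[-2], X[-1]
    W = X[2:-2]
    Sigma, sigma_xi2, gamma_alpha = theta[0], theta[1], theta[2]
    gW = np.array(theta[3:]) @ W
    eps1 = EF - gamma_alpha * gW
    if j == k:
        eps2 = FF - (gamma_alpha**2) * gW - gW * sigma_xi2 - Sigma
        eps3 = 0.0
    else:
        eps2 = 0.0
        eps3 = FF - (gamma_alpha**2) * gW
    return np.concatenate([eps1 * W, eps2 * W, eps3 * W])

def h_Y1_vectorized(X, theta):
    # Column-wise version from the answer
    EF, FF, W, j, k = np.hsplit(X, [1, 2, 4, 5])
    Sigma, sigma_xi2, gamma_alpha, *gamma_z = theta
    gW = (W @ gamma_z)[:, None]
    ga_gW = gamma_alpha * gW
    FF_ga2gW = FF - gamma_alpha * ga_gW
    eps1 = EF - ga_gW
    j_equal_k = j == k
    eps2 = np.where(j_equal_k, FF_ga2gW - gW * sigma_xi2 - Sigma, 0)
    eps3 = np.where(j_equal_k, 0, FF_ga2gW)
    return np.hstack([eps1 * W, eps2 * W, eps3 * W])

# Assumed test setup: small sample, fixed seed, nonzero gamma_z
rng = np.random.default_rng(0)
sample = rng.random((1000, 6))
sample[:, 4:] = rng.integers(0, 3, size=(1000, 2))  # integer j, k so both branches occur
theta = [0.01, 1.0, 1.0, 0.5, -0.3]

expected = np.apply_along_axis(h_Y1, 1, sample, theta=theta)
actual = h_Y1_vectorized(sample, theta)
same = np.allclose(expected, actual)
print(same)
```

Note that the split indices `[1, 2, 4, 5]` assume exactly two W columns, as in the 6-column gmmarray of the question; a different number of W columns would need different indices.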