NumPy中如何加速计算很多小的协方差?

How to speed up the calculation of a lot of small covariance in NumPy?

是否可以在 NumPy 中加快小的协方差计算?函数“diff_cov_ridge”在我的程序中被调用了数百万次。

“theta”是标量,而“tx”、“ty”、“img1”、“ix1”、“iy1”、“x1”、“y1”、“img2”、“ix2”、“ iy2", "x2", "y2" 是长度为 n 的向量。

def cov(a, b):
    return np.cov(a, b)[0, 1]

def diff_cov_ridge(theta, tx, ty, img1, ix1, iy1, x1, y1, img2, ix2, iy2, x2, y2):

    ct = np.cos(theta)
    st = np.sin(theta)
    eq1 = cov(img1, ix2*x2)
    eq2 = cov(img1, ix2*y2)
    eq3 = cov(img1, iy2*x2)
    eq4 = cov(img1, iy2*y2)
    eq5 = cov(img2, ix1*x1)
    eq6 = cov(img2, ix1*y1)
    eq7 = cov(img2, iy1*x1)
    eq8 = cov(img2, iy1*y1)
    eq9 = cov(ix2, ix1*tx*x1)
    eq10 = cov(ix1, ix2*tx*x2)
    eq11 = cov(ix1*y1, ix2*tx)
    eq12 = cov(ix1, ix2*tx*y2)
    eq13 = cov(ix1*x1, ix2*x2)
    eq14 = cov(ix1*x1, ix2*y2)
    eq15 = cov(ix1*y1, ix2*x2)
    eq16 = cov(ix1*y1, ix2*y2)
    eq17 = cov(ix1, iy2*tx*x2)
    eq18 = cov(ix1, iy2*tx*y2)
    eq19 = cov(ix1*x1, iy2*ty)
    eq20 = cov(ix1*y1, iy2*ty)
    eq21 = cov(ix1*x1, iy2*x2)
    eq22 = cov(ix1*x1, iy2*y2)
    eq23 = cov(ix1*y1, iy2*x2)
    eq24 = cov(ix1*y1, iy2*y2)
    eq25 = cov(ix2, iy1*tx*x1)
    eq26 = cov(ix2, iy1*tx*y1)
    eq27 = cov(iy1, ix2*ty*x2)
    eq28 = cov(iy1, ix2*ty*y2)
    eq29 = cov(ix2*x2, iy1*x1)
    eq30 = cov(ix2*y2, iy1*x1)
    eq31 = cov(ix2*x2, iy1*y1)
    eq32 = cov(ix2*y2, iy1*y1)
    eq33 = cov(iy1*x1, iy2*ty)
    eq34 = cov(iy1, iy2*ty*x2)
    eq35 = cov(iy1*y1, iy2*ty)
    eq36 = cov(iy1, iy2*ty*y2)
    eq37 = cov(iy1*x1, iy2*x2)
    eq38 = cov(iy1*x1, iy2*y2)
    eq39 = cov(iy1*y1, iy2*x2)
    eq40 = cov(iy1*y1, iy2*y2)

np.cov(a, b)[0, 1]的定义就是

np.sum((a - np.mean(a)) * (b - np.mean(b))) / (a.size - 1)

因此,您可以避免对角元素的计算和对 2x2 矩阵的索引,这应该可以将计算速度提高 1.5 到 3 倍。稍微快一点的公式是

np.dot(a - a.mean(), b - b.mean()) / (a.size - 1)

这是一个非常小的 (a.size == 10) 数组的非正式计时测试,显示了差异:

%timeit np.cov(a, b)[0, 1]
39.3 µs ± 751 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit np.sum((a - np.mean(a)) * (b - np.mean(b))) / (a.size - 1)
23.7 µs ± 370 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit np.dot(a - a.mean(), b - b.mean()) / (a.size - 1)
18 µs ± 83.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

我强烈怀疑使用上面的公式,你可以预先计算一些你需要避免多次调用 cov 的数量。


您可以像处理方差一样分解协方差的计算:

((a * b).sum() - a.sum() * b.sum() / a.size) / (a.size - 1)

这提供了 2 倍以上加速的额外因素:

%timeit ((a * b).sum() - a.sum() * b.sum() / a.size) / (a.size - 1)
8.03 µs ± 41.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

这里的额外优势是您可以预先计算许多总和。例如,img1 出现在您的 4 个等式中,但您只需为所有这些计算一次 img1.sum()