NumPy中如何加速计算很多小的协方差?
How to speed up the calculation of a lot of small covariance in NumPy?
是否可以在 NumPy 中加快小的协方差计算?函数“diff_cov_ridge”在我的程序中被调用了数百万次。
“theta”是标量,而“tx”、“ty”、“img1”、“ix1”、“iy1”、“x1”、“y1”、“img2”、“ix2”、“ iy2", "x2", "y2" 是长度为 n 的向量。
def cov(a, b):
return np.cov(a, b)[0, 1]
def diff_cov_ridge(theta, tx, ty, img1, ix1, iy1, x1, y1, img2, ix2, iy2, x2, y2):
ct = np.cos(theta)
st = np.sin(theta)
eq1 = cov(img1, ix2*x2)
eq2 = cov(img1, ix2*y2)
eq3 = cov(img1, iy2*x2)
eq4 = cov(img1, iy2*y2)
eq5 = cov(img2, ix1*x1)
eq6 = cov(img2, ix1*y1)
eq7 = cov(img2, iy1*x1)
eq8 = cov(img2, iy1*y1)
eq9 = cov(ix2, ix1*tx*x1)
eq10 = cov(ix1, ix2*tx*x2)
eq11 = cov(ix1*y1, ix2*tx)
eq12 = cov(ix1, ix2*tx*y2)
eq13 = cov(ix1*x1, ix2*x2)
eq14 = cov(ix1*x1, ix2*y2)
eq15 = cov(ix1*y1, ix2*x2)
eq16 = cov(ix1*y1, ix2*y2)
eq17 = cov(ix1, iy2*tx*x2)
eq18 = cov(ix1, iy2*tx*y2)
eq19 = cov(ix1*x1, iy2*ty)
eq20 = cov(ix1*y1, iy2*ty)
eq21 = cov(ix1*x1, iy2*x2)
eq22 = cov(ix1*x1, iy2*y2)
eq23 = cov(ix1*y1, iy2*x2)
eq24 = cov(ix1*y1, iy2*y2)
eq25 = cov(ix2, iy1*tx*x1)
eq26 = cov(ix2, iy1*tx*y1)
eq27 = cov(iy1, ix2*ty*x2)
eq28 = cov(iy1, ix2*ty*y2)
eq29 = cov(ix2*x2, iy1*x1)
eq30 = cov(ix2*y2, iy1*x1)
eq31 = cov(ix2*x2, iy1*y1)
eq32 = cov(ix2*y2, iy1*y1)
eq33 = cov(iy1*x1, iy2*ty)
eq34 = cov(iy1, iy2*ty*x2)
eq35 = cov(iy1*y1, iy2*ty)
eq36 = cov(iy1, iy2*ty*y2)
eq37 = cov(iy1*x1, iy2*x2)
eq38 = cov(iy1*x1, iy2*y2)
eq39 = cov(iy1*y1, iy2*x2)
eq40 = cov(iy1*y1, iy2*y2)
np.cov(a, b)[0, 1]
的定义就是
np.sum((a - np.mean(a)) * (b - np.mean(b))) / (a.size - 1)
因此,您可以避免对角元素的计算和对 2x2 矩阵的索引,这应该可以将计算速度提高 1.5 到 3 倍。稍微快一点的公式是
np.dot(a - a.mean(), b - b.mean()) / (a.size - 1)
这是一个非常小的 (a.size == 10
) 数组的非正式计时测试,显示了差异:
%timeit np.cov(a, b)[0, 1]
39.3 µs ± 751 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit np.sum((a - np.mean(a)) * (b - np.mean(b))) / (a.size - 1)
23.7 µs ± 370 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit np.dot(a - a.mean(), b - b.mean()) / (a.size - 1)
18 µs ± 83.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
我强烈怀疑使用上面的公式,你可以预先计算一些你需要避免多次调用 cov
的数量。
您可以像处理方差一样分解协方差的计算:
((a * b).sum() - a.sum() * b.sum() / a.size) / (a.size - 1)
这提供了 2 倍以上加速的额外因素:
%timeit ((a * b).sum() - a.sum() * b.sum() / a.size) / (a.size - 1)
8.03 µs ± 41.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
这里的额外优势是您可以预先计算许多总和。例如,img1
出现在您的 4 个等式中,但您只需为所有这些计算一次 img1.sum()
。
是否可以在 NumPy 中加快小的协方差计算?函数“diff_cov_ridge”在我的程序中被调用了数百万次。
“theta”是标量,而“tx”、“ty”、“img1”、“ix1”、“iy1”、“x1”、“y1”、“img2”、“ix2”、“ iy2", "x2", "y2" 是长度为 n 的向量。
def cov(a, b):
return np.cov(a, b)[0, 1]
def diff_cov_ridge(theta, tx, ty, img1, ix1, iy1, x1, y1, img2, ix2, iy2, x2, y2):
ct = np.cos(theta)
st = np.sin(theta)
eq1 = cov(img1, ix2*x2)
eq2 = cov(img1, ix2*y2)
eq3 = cov(img1, iy2*x2)
eq4 = cov(img1, iy2*y2)
eq5 = cov(img2, ix1*x1)
eq6 = cov(img2, ix1*y1)
eq7 = cov(img2, iy1*x1)
eq8 = cov(img2, iy1*y1)
eq9 = cov(ix2, ix1*tx*x1)
eq10 = cov(ix1, ix2*tx*x2)
eq11 = cov(ix1*y1, ix2*tx)
eq12 = cov(ix1, ix2*tx*y2)
eq13 = cov(ix1*x1, ix2*x2)
eq14 = cov(ix1*x1, ix2*y2)
eq15 = cov(ix1*y1, ix2*x2)
eq16 = cov(ix1*y1, ix2*y2)
eq17 = cov(ix1, iy2*tx*x2)
eq18 = cov(ix1, iy2*tx*y2)
eq19 = cov(ix1*x1, iy2*ty)
eq20 = cov(ix1*y1, iy2*ty)
eq21 = cov(ix1*x1, iy2*x2)
eq22 = cov(ix1*x1, iy2*y2)
eq23 = cov(ix1*y1, iy2*x2)
eq24 = cov(ix1*y1, iy2*y2)
eq25 = cov(ix2, iy1*tx*x1)
eq26 = cov(ix2, iy1*tx*y1)
eq27 = cov(iy1, ix2*ty*x2)
eq28 = cov(iy1, ix2*ty*y2)
eq29 = cov(ix2*x2, iy1*x1)
eq30 = cov(ix2*y2, iy1*x1)
eq31 = cov(ix2*x2, iy1*y1)
eq32 = cov(ix2*y2, iy1*y1)
eq33 = cov(iy1*x1, iy2*ty)
eq34 = cov(iy1, iy2*ty*x2)
eq35 = cov(iy1*y1, iy2*ty)
eq36 = cov(iy1, iy2*ty*y2)
eq37 = cov(iy1*x1, iy2*x2)
eq38 = cov(iy1*x1, iy2*y2)
eq39 = cov(iy1*y1, iy2*x2)
eq40 = cov(iy1*y1, iy2*y2)
np.cov(a, b)[0, 1]
的定义就是
np.sum((a - np.mean(a)) * (b - np.mean(b))) / (a.size - 1)
因此,您可以避免对角元素的计算和对 2x2 矩阵的索引,这应该可以将计算速度提高 1.5 到 3 倍。稍微快一点的公式是
np.dot(a - a.mean(), b - b.mean()) / (a.size - 1)
这是一个非常小的 (a.size == 10
) 数组的非正式计时测试,显示了差异:
%timeit np.cov(a, b)[0, 1]
39.3 µs ± 751 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit np.sum((a - np.mean(a)) * (b - np.mean(b))) / (a.size - 1)
23.7 µs ± 370 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit np.dot(a - a.mean(), b - b.mean()) / (a.size - 1)
18 µs ± 83.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
我强烈怀疑使用上面的公式,你可以预先计算一些你需要避免多次调用 cov
的数量。
您可以像处理方差一样分解协方差的计算:
((a * b).sum() - a.sum() * b.sum() / a.size) / (a.size - 1)
这提供了 2 倍以上加速的额外因素:
%timeit ((a * b).sum() - a.sum() * b.sum() / a.size) / (a.size - 1)
8.03 µs ± 41.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
这里的额外优势是您可以预先计算许多总和。例如,img1
出现在您的 4 个等式中,但您只需为所有这些计算一次 img1.sum()
。