Numpy - Covariance between rows of two matrices

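I need to compute the covariance between corresponding rows of two different matrices: the first row of the first matrix with the first row of the second matrix, and so on through the last row of each. I can do this with the loop-based code below. My question is: can I avoid the for loop and get the same result using NumPy alone?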

import numpy as np

m1 = np.array([[1,2,3],[2,2,2]])
m2 = np.array([[2.56, 2.89, 3.76],[1,2,3.95]])

output = []
for a,b in zip(m1,m2):
    # np.cov returns a 2x2 covariance matrix; [0][1] is cov(a, b).
    cov = np.cov(a, b)
    output.append(cov[0][1])
print(output)

Thanks in advance!

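You can use a list comprehension instead of the for loop, and, if you want, you can get rid of zip by concatenating the two arrays along a new (third) dimension.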

import numpy as np

m1 = np.array([[1,2,3],[2,2,2]])
m2 = np.array([[2.56, 2.89, 3.76],[1,2,3.95]])

# List comprehension on zipped arrays.
out2 = [np.cov(a, b)[0][1] for a, b in zip(m1, m2)]
print(out2)
# [0.5999999999999999, 0.0]

# List comprehension on concatenated arrays.
big_array = np.concatenate((m1[:, np.newaxis, :],
                            m2[:, np.newaxis, :]), axis=1)

out3 = [np.cov(X)[0][1] for X in big_array]
print(out3)
# [0.5999999999999999, 0.0]
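
Both versions above still loop in Python. As a rough sketch of a fully vectorized alternative, the sample covariance of each row pair can be written out directly: center every row on its own mean, multiply element-wise, sum along the row, and divide by N - 1 (the default normalization of np.cov). The helper name rowwise_cov and its ddof parameter are made up for this illustration and are not part of NumPy.

```python
import numpy as np


def rowwise_cov(a, b, ddof=1):
    """Covariance between matching rows of a and b (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Center each row by subtracting that row's mean.
    a_centered = a - a.mean(axis=1, keepdims=True)
    b_centered = b - b.mean(axis=1, keepdims=True)
    # Sum the products of deviations per row, then normalize by N - ddof.
    return (a_centered * b_centered).sum(axis=1) / (a.shape[1] - ddof)


m1 = np.array([[1, 2, 3], [2, 2, 2]])
m2 = np.array([[2.56, 2.89, 3.76], [1, 2, 3.95]])

out4 = rowwise_cov(m1, m2)
print(out4)
```

This returns a NumPy array rather than a list, and the values should match the list comprehension results up to floating-point rounding.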

If you are working with large arrays, I would consider this:

from numba import jit
import numpy as np


m1 = np.random.rand(10000, 3)
m2 = np.random.rand(10000, 3)

@jit(nopython=True)
def nb_cov(a, b):
    return [np.cov(x)[0,1] for x in np.stack((a, b), axis=1)]

This gives a run time of:
>>> %timeit nb_cov(m1, m2)
The slowest run took 94.24 times longer than the fastest. This could mean that an intermediate result is being cached.
1 loop, best of 5: 10.5 ms per loop

compared with:
>>> %timeit [np.cov(x)[0,1] for x in np.stack((m1, m2), axis=1)]
1 loop, best of 5: 410 ms per loop