Interpreting (and comparing) output from numpy.correlate

Question

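I have looked at this question, but it didn't really give me any answers.

Essentially, how do I use np.correlate to determine whether a strong correlation exists? I expected the same kind of output I get from MATLAB's xcorr with the coeff option, which I can understand (1 is a strong correlation at lag l, 0 is no correlation at lag l), but np.correlate produces values greater than 1, even when the input vectors have been normalised to lie between 0 and 1.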

Example input

import numpy as np
x = np.random.rand(10)
y = np.random.rand(10)

np.correlate(x, y, 'full')

This gives the following output:

array([ 0.15711279,  0.24562736,  0.48078652,  0.69477838,  1.07376669,
    1.28020871,  1.39717118,  1.78545567,  1.85084435,  1.89776181,
    1.92940874,  2.05102884,  1.35671247,  1.54329503,  0.8892999 ,
    0.67574802,  0.90464743,  0.20475408,  0.33001517])

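If I don't know the maximum possible correlation value, how can I tell what counts as a strong correlation and what counts as a weak one?

Another example: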

In [10]: x = [0,1,2,1,0,0]

In [11]: y = [0,0,1,2,1,0]

In [12]: np.correlate(x, y, 'full')
Out[12]: array([0, 0, 1, 4, 6, 4, 1, 0, 0, 0, 0])

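Edit: This was a badly asked question, but the marked answer does answer what was asked. I think it is important to note what I found while digging into this area: you cannot compare the outputs of cross-correlations. In other words, you cannot use the output of a cross-correlation to claim that signal x is better correlated with signal y than with signal z. Cross-correlation does not provide that kind of information.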

Answer 1

numpy.correlate is under-documented. I think we can make sense of it, though. Let's start with your sample case:

>>> import numpy as np
>>> x = [0,1,2,1,0,0]
>>> y = [0,0,1,2,1,0]
>>> np.correlate(x, y, 'full')
array([0, 0, 1, 4, 6, 4, 1, 0, 0, 0, 0])

These numbers are the cross-correlations for each of the possible lags. To make that clearer, let's put the lag numbers above the correlations:

>>> np.concatenate((np.arange(-5, 6)[None,...], np.correlate(x, y, 'full')[None,...]), axis=0)
array([[-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [ 0,  0,  1,  4,  6,  4,  1,  0,  0,  0,  0]])

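Here, we can see that the cross-correlation peaks at a lag of -1. If you look at x and y above, that makes sense: shift y one position to the left and it matches x exactly.

To verify this, let's try again, this time shifting y further: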
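As a rough check of the lag interpretation, the sketch below recomputes each entry by hand and reads the peak lag with argmax. The helper xcorr_at_lag is a hypothetical name introduced only for illustration, and the code assumes the two inputs have the same length, as in the example above.

```python
import numpy as np

x = np.array([0, 1, 2, 1, 0, 0])
y = np.array([0, 0, 1, 2, 1, 0])

def xcorr_at_lag(a, v, k):
    """Hypothetical helper: sum a[n + k] * v[n] over all n where both indices are valid."""
    total = 0
    for n in range(len(v)):
        if 0 <= n + k < len(a):
            total += a[n + k] * v[n]
    return total

# With mode='full', the lags run from -(len(y) - 1) to len(x) - 1.
lags = np.arange(-(len(y) - 1), len(x))
corr = np.correlate(x, y, 'full')

# The hand-rolled sums agree with np.correlate at every lag.
manual = np.array([xcorr_at_lag(x, y, k) for k in lags])
assert np.array_equal(manual, corr)

# The lag with the largest cross-correlation.
peak_lag = lags[np.argmax(corr)]
print(peak_lag)  # -1
```

Building lags from the input lengths avoids hard-coding the -5..5 range used in the session above.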

>>> y = [0, 0, 0, 0, 1, 2]
>>> np.concatenate((np.arange(-5, 6)[None,...], np.correlate(x, y, 'full')[None,...]), axis=0)
array([[-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [ 0,  2,  5,  4,  1,  0,  0,  0,  0,  0,  0]])

Now the correlation peaks at a lag of -3, which means the best match between x and y occurs when y is shifted 3 positions to the left.
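As for values greater than 1: np.correlate returns raw sums of products, so its scale depends on the energy of the inputs. One possible way to get values bounded by 1, in the spirit of what the coeff option of xcorr is understood to do, is to divide by the square root of the product of the two zero-lag autocorrelations. The sketch below makes that assumption; normalized_xcorr is a hypothetical name, not part of NumPy, and the means are not subtracted.

```python
import numpy as np

def normalized_xcorr(a, v):
    """Hypothetical helper: full cross-correlation scaled to lie in [-1, 1]."""
    a = np.asarray(a, dtype=float)
    v = np.asarray(v, dtype=float)
    # sqrt(sum(a**2) * sum(v**2)) bounds every lag by the Cauchy-Schwarz inequality.
    # Note: this is zero (division by zero) if either input is all zeros.
    scale = np.sqrt(np.dot(a, a) * np.dot(v, v))
    return np.correlate(a, v, 'full') / scale

x = [0, 1, 2, 1, 0, 0]
y = [0, 0, 1, 2, 1, 0]
r = normalized_xcorr(x, y)
print(r.max())  # 1.0, reached at lag -1 where the shifted y matches x exactly
```

Because the means are left in, signals that are entirely non-negative (such as the random inputs in the question) will still show fairly high values at most lags, so a value near 1 here indicates a matching shape only after accounting for that offset.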
