
Basics of Normalizing Cross-Correlation with a View to Comparing Signals

I am trying to understand how to use cross-correlation to determine the similarity of two signals. This tutorial offers a very clear explanation of the basics, but I still don't understand how to use normalization effectively to prevent strong signals from dominating the cross-correlation measure when the signals have different energy levels. The same tutor, David Dorran, discusses the issue of normalization here and explains how to normalize the correlation using the dot product, but I still have some questions.

I wrote this Python routine to cross-correlate every pair of signals in a set of signals:

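import numpy as np
import pandas as pd

def mycorrelate2d(df, normalized=False):
    # initialize cross correlation matrix with zeros
    ccm = np.zeros(shape=df.shape, dtype=list)
    for i, row_dict1 in enumerate(df.to_dict(orient='records')):
        outer_row = list(row_dict1.values())
        for j, row_dict2 in enumerate(df.to_dict(orient='records')):
            inner_row = list(row_dict2.values())
            # mode='full' returns the correlation at every lag (2*N - 1 values),
            # which is what the 5-element outputs below correspond to
            x = np.correlate(inner_row, outer_row, mode='full')
            if normalized:
                # divide by the dot product of the two signals
                n = np.dot(inner_row, outer_row)
                x = x / n
            ccm[i][j] = x
    return ccm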

Suppose I have 3 signals of increasing amplitude: [1, 2, 3], [4, 5, 6] and [7, 8, 9].

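I want to cross-correlate these three signals to see which pairs are similar, but when I pass the 3 signals into the routine I wrote, I don't seem to get a measure of similarity. The size of the cross-correlation values is just a function of the signals' energy. Period. Even the cross-correlation of a signal with itself yields lower values than the cross-correlation of that same signal with another, higher-energy signal.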

df_x3 = pd.DataFrame(
        np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]]).reshape(3, -1))

mycorrelate2d(df_x3)


array([[array([ 3,  8, 14,  8,  3]), 
        array([12, 23, 32, 17,  6]),
        array([21, 38, 50, 26,  9])],
       [array([ 6, 17, 32, 23, 12]), 
        array([24, 50, 77, 50, 24]),
        array([ 42,  83, 122,  77,  36])],
       [array([ 9, 26, 50, 38, 21]), 
        array([ 36,  77, 122,  83,  42]),
        array([ 63, 128, 194, 128,  63])]], dtype=object)
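The amplitude dependence can be reproduced in isolation. The sketch below is only an illustration: the scale factor `k` and the variable names are made up for the example. It shows that the raw peak of a signal's autocorrelation is smaller than its peak against a larger signal, and that multiplying one input by `k` multiplies every correlation value by `k`.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([7, 8, 9])

# Raw cross-correlation at every lag (mode="full" gives 2*N - 1 values).
auto = np.correlate(x, x, mode="full")
cross = np.correlate(y, x, mode="full")

# Scaling one input by k scales every correlation value by k,
# so the raw peak tracks amplitude rather than shape.
k = 10  # arbitrary factor chosen for illustration
scaled = np.correlate(k * y, x, mode="full")

print(auto.max(), cross.max(), scaled.max())
```

Because the raw values grow with amplitude, they cannot be compared across pairs without some form of normalization.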

Now I pass in the same 3 signals, but this time I indicate that I want normalized results:

mycorrelate2d(df_x3, normalized=True)


array([[array([ 0.2142, 0.5714,  1., 0.5714, 0.2142]),
        array([ 0.375,  0.71875, 1., 0.5312, 0.1875]),
        array([ 0.42,   0.76,    1., 0.52,   0.18])],
       [array([ 0.1875, 0.5312,  1., 0.7187, 0.375]),
        array([ 0.3116, 0.6493,  1., 0.6493, 0.3116]),
        array([ 0.3442, 0.6803,  1., 0.6311, 0.2950])],
       [array([ 0.18,   0.52,    1., 0.76,   0.42]),
        array([ 0.2950, 0.6311,  1., 0.6803, 0.3442]),
        array([ 0.3247, 0.6597,  1., 0.6597, 0.3247])]], dtype=object)
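The 1. in the middle of every array above follows from the normalization itself. For two signals of equal length, the zero-lag term of the full cross-correlation is exactly their dot product, so dividing by `np.dot` forces the middle value to 1 regardless of how similar the signals are. A minimal sketch (variable names are illustrative):

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

full = np.correlate(y, x, mode="full")  # lags -2 .. +2
center = full[len(x) - 1]               # zero-lag term for equal-length inputs

# The zero-lag term equals the dot product of the two signals,
# so full / np.dot(y, x) always has 1.0 in the middle.
normalized = full / np.dot(y, x)
print(center, np.dot(y, x))
print(normalized)
```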



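So the formula you are using for normalization is not quite right. In NCC the normalization happens before the correlation: each signal has its mean subtracted and is divided by its standard deviation. The result is then divided by the vector length, as in the Wikipedia formula https://en.wikipedia.org/wiki/Cross-correlation#Zero-normalized_cross-correlation_(ZNCC)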
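For two 1-D signals of equal length, the zero-lag form of that formula can be written as below. The symbols are notation chosen here, not taken from the post: $x_n$ and $y_n$ are the samples, $N$ is the number of samples, $\mu$ is the mean and $\sigma$ is the population standard deviation (what `np.std` computes by default).

```latex
\[
\mathrm{ZNCC}(x, y)
  = \frac{1}{N} \sum_{n=1}^{N}
    \frac{(x_n - \mu_x)\,(y_n - \mu_y)}{\sigma_x \, \sigma_y}
\]
```

The result lies between $-1$ and $1$ and does not change when either signal is multiplied by a positive constant or shifted by an offset.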



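import numpy as np

def mycorrelate2d(df, normalized=False):
    # initialize cross correlation matrix with zeros (one value per pair of rows)
    n = len(df)
    ccm = np.zeros((n, n))
    for i in range(n):
        outer_row = df[i]
        for j in range(n):
            inner_row = df[j]
            if not normalized:
                x = np.correlate(inner_row, outer_row)
            else:
                # subtract the mean and divide by the standard deviation;
                # one of the two inputs is also divided by the vector length
                a = (inner_row - np.mean(inner_row)) / (np.std(inner_row) * len(inner_row))
                b = (outer_row - np.mean(outer_row)) / np.std(outer_row)
                x = np.correlate(a, b)
            # the default mode='valid' yields a 1-element array for equal-length inputs
            ccm[i][j] = x[0]
    return ccm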

df_x3 = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]]).reshape(3, -1)
print(mycorrelate2d(df_x3, normalized=True))
df_x3 = np.array([[1, 2, 3],
                  [9, 5, 6],
                  [74, 8, 9]]).reshape(3, -1)
print(mycorrelate2d(df_x3, normalized=True))


[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
[[ 1.         -0.72057669 -0.85933941]
 [-0.72057669  1.          0.97381599]
 [-0.85933941  0.97381599  1.        ]]
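At zero lag this normalized value is the Pearson correlation coefficient, so the matrix can be cross-checked against `np.corrcoef`. The sketch below is a vectorized variant rather than the code above, and `zncc_matrix` is a hypothetical helper name introduced only for this example. It assumes every signal has the same length and a non-zero standard deviation.

```python
import numpy as np

def zncc_matrix(signals):
    """Zero-lag ZNCC for every pair of rows (hypothetical helper name)."""
    signals = np.asarray(signals, dtype=float)
    # Subtract each row's mean and divide by its population std (ddof=0).
    z = (signals - signals.mean(axis=1, keepdims=True)) / signals.std(axis=1, keepdims=True)
    # Dot product of every pair of standardized rows, divided by the length.
    return z @ z.T / signals.shape[1]

data = np.array([[1, 2, 3],
                 [9, 5, 6],
                 [74, 8, 9]])

result = zncc_matrix(data)
print(result)
# Cross-check against NumPy's Pearson correlation matrix (rows are variables).
print(np.allclose(result, np.corrcoef(data)))
```

A constant signal has zero standard deviation, so it would cause a division by zero here and needs separate handling.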