numpy cross-correlation - vectorizing
I have a large number of cross-correlations to compute, and I'm looking for the fastest way to do it. I'm assuming that vectorizing the problem will help, rather than solving it with loops.
I have a 3D array labeled electrodes x timepoints x trials (shape: 64x256x913). For each trial, I want to compute the maximum cross-correlation over timepoints for every pair of electrodes.
Specifically: for each trial, I want to take every combination of electrode pairs and compute the maximum cross-correlation value for each pair. That produces 4096 (64*64) maximum cross-correlation values in a single row/vector. Doing this for every trial and stacking the rows/vectors gives a final 2D array of shape 913x4096 containing the maximum cross-correlation values.
That's a lot of computation, but I'd like to find the fastest way to do it. I mocked up some prototype code using lists as containers, which may help explain the problem better. There may be a logic error or two in there, but either way the code won't run on my machine: there is so much to compute that Python simply freezes. Here it is:
import numpy as np

# allData is a 64x256x913 array: electrodes x timepoints x trials
all_experiment_trials = []
for trial in range(allData.shape[2]):
    all_trial_electrodes = []
    for electrode in range(allData.shape[0]):
        for other_electrode in range(allData.shape[0]):
            if electrode == other_electrode:
                pass
            else:
                # Peak of the full cross-correlation between this pair's time series
                single_xcorr = max(np.correlate(allData[electrode,:,trial], allData[other_electrode,:,trial], "full"))
                all_trial_electrodes.append(single_xcorr)
    all_experiment_trials.append(all_trial_electrodes)
Obviously, loops are really slow for this kind of thing. Is there a vectorized solution using numpy arrays?
I've looked at things like correlate2d(), but I don't think they really apply to my case, since I'm not multiplying two matrices together.
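To make the target concrete: each entry of the desired matrix is just the peak of the full cross-correlation between two electrodes' time series within one trial, along these lines (a minimal sketch reusing the allData array from above, with the electrode and trial indices chosen arbitrarily for illustration):

# One entry of the desired result: the peak of the full cross-correlation
# between electrode 0 and electrode 1 within trial 0 (indices are illustrative)
a = allData[0, :, 0]
b = allData[1, :, 0]
peak = max(np.correlate(a, b, "full"))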
Here's a vectorized approach based on np.einsum -
import numpy as np

def vectorized_approach(allData):
    # Get shape
    M,N,R = allData.shape
    # Valid mask based on the condition "if electrode == other_electrode":
    # it drops the diagonal (an electrode paired with itself)
    valid_mask = np.mod(np.arange(M*M),M+1)!=0
    # Elementwise multiplication between every pair of electrodes,
    # then summation along the timepoint axis
    out = np.einsum('ijkl,ijkl->lij',allData[:,None,:,:],allData[None,:,:,:])
    # Use valid mask to skip the diagonal columns and get the final output
    return out.reshape(R,-1)[:,valid_mask]
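As a quick sanity check of the einsum line: for each trial it computes the plain dot product between every pair of electrode time series, which is also what np.correlate returns in its default 'valid' mode for equal-length inputs (the mode used in the timing test below). A tiny check, with small arbitrary shapes chosen just for illustration:

import numpy as np

M, N, R = 3, 5, 2                      # small, arbitrary shapes for the check
data = np.random.rand(M, N, R)
out = np.einsum('ijkl,ijkl->lij', data[:, None, :, :], data[None, :, :, :])

# out[l, i, j] is the zero-lag correlation (dot product) of electrodes i and j in trial l
assert np.allclose(out[1, 0, 2], np.dot(data[0, :, 1], data[2, :, 1]))
assert np.allclose(out[1, 0, 2], np.correlate(data[0, :, 1], data[2, :, 1]))   # default 'valid' mode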
Runtime test and output verification -
In [10]: allData = np.random.rand(20,80,200)
In [11]: def org_approach(allData):
    ...:     all_experiment_trials = []
    ...:     for trial in range(allData.shape[2]):
    ...:         all_trial_electrodes = []
    ...:         for electrode in range(allData.shape[0]):
    ...:             for other_electrode in range(allData.shape[0]):
    ...:                 if electrode == other_electrode:
    ...:                     pass
    ...:                 else:
    ...:                     single_xcorr = max(np.correlate(allData[electrode,:,trial], allData[other_electrode,:,trial]))
    ...:                     all_trial_electrodes.append(single_xcorr)
    ...:         all_experiment_trials.append(all_trial_electrodes)
    ...:     return all_experiment_trials
    ...:
In [12]: %timeit org_approach(allData)
1 loops, best of 3: 1.04 s per loop
In [13]: %timeit vectorized_approach(allData)
100 loops, best of 3: 15.1 ms per loop
In [14]: np.allclose(vectorized_approach(allData),np.asarray(org_approach(allData)))
Out[14]: True
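For reference, applying the same function to data of the size described in the question would look roughly like this (random values stand in for the real recordings). Note that because the diagonal is masked out, each row has 64*64 - 64 = 4032 entries rather than the full 4096:

# Hypothetical run on data of the full size from the question
allData = np.random.rand(64, 256, 913)      # electrodes x timepoints x trials
result = vectorized_approach(allData)
print(result.shape)                         # (913, 4032)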