具有可变标准偏差内核的卷积数组

Question

祝你程序员好！
今天我想做一些我认为很棘手的事情。我有一个名为 tac 的非常大的二维数组，它基本上包含时间曲线值和一个包含名为 coor 的坐标元组的文件，该文件包含有关将这些曲线放置在 3D 数组中何处的信息。这组变量代表的其实是一个4维数组：前3维代表space维，第四维代表时间。整个事情都按原样存储，以避免存储大量的零。

我想对这组数据每次（换句话说，第 4 维中的每个值）应用高斯核。我能够生成这个内核并使用 scipy.ndimage.convolve 很容易地为整个数组执行卷积以获得固定的标准偏差。内核是使用 scipy.signal.gaussian 创建的。这里简单举例说明tac_4d包含4维数组的原理（存储了很多我知道的数据……但当时有一个问题）：

def gaussian_kernel_3d(radius, sigma):
    num = 2 * radius + 1
    kernel_1d = signal.gaussian(num, std=sigma).reshape(num, 1)
    kernel_2d = np.outer(kernel_1d, kernel_1d)
    kernel_3d = np.outer(kernel_1d, kernel_2d).reshape(num, num, num)
    kernel_3d = np.expand_dims(kernel_3d, -1)
    return kernel_3d

g = gaussian_kernel_3d(1, .5)
cag = nd.convolve(tac_4d, g, mode='constant', cval=0.0)

现在的诀窍是用内核对数组进行卷积，每个 SPACE 坐标的标准差都不同。换句话说，我会有一个 3D 数组 std，其中包含数组每个坐标的标准差。

似乎 https://github.com/sheliak/varconvolve 是解决此问题所需的代码。但是我真的不明白如何使用它，坦率地说，我更愿意提出一个真正的解决方案。你们有没有办法解决这个问题？

提前致谢！

编辑

下面是我希望可以考虑的MCVE

import numpy as np
from scipy import signal
from scipy import ndimage as nd


def gaussian_kernel_2d(radius, sigma):
    num = 2 * radius + 1
    kernel_1d = signal.gaussian(num, std=sigma).reshape(num, 1)
    kernel_2d = np.outer(kernel_1d, kernel_1d)
    return kernel_2d


def gaussian_kernel_3d(radius, sigma):
    num = 2 * radius + 1
    kernel_1d = signal.gaussian(num, std=sigma).reshape(num, 1)
    kernel_2d = np.outer(kernel_1d, kernel_1d)
    kernel_3d = np.outer(kernel_1d, kernel_2d).reshape(num, num, num)
    kernel_3d = np.expand_dims(kernel_3d, -1)
    return kernel_3d


np.random.seed(0)

number_of_tac = 150
time_samples = 915
z, y, x = 100, 150, 100
voxel_number = x * y * z

# TACs in the right order
tac = np.random.uniform(0, 4, time_samples * number_of_tac).reshape(number_of_tac, time_samples)

arr = np.array([0] * (voxel_number - number_of_tac) + [1] * number_of_tac)
np.random.shuffle(arr)
arr = arr.reshape(z, y, x)
coor = np.where(arr != 0)  # non-empty voxel

# Algorithm to replace TAC in 3D space
nnz = np.zeros(arr.shape)
nnz[coor] = 1
tac_4d = np.zeros((x, y, z, time_samples))
tac_4d[np.where(nnz == 1)] = tac

# 3D convolution for all time
# TODO: find a way to make standard deviation change for each voxel
g = gaussian_kernel_3d(1, 1)  # 3D kernel of std = 1
v = np.random.uniform(0, 1, x * y * z).reshape(z, y, x)  # 3D array of std
cag = nd.convolve(tac_4d, g, mode='constant', cval=0.0)  # convolution

Answer 1

本质上，您有一个 4D 数据集，形状 (nx, ny, nz, nt) 在 (nx, ny, nz) 轴上稀疏，在 nt 轴上密集。如果 (i, j, k) 是稀疏维度中非零点的坐标，您希望与具有依赖于 (i, j, k).

的 sigma 的高斯 3D 内核进行卷积

例如，如果在 [1, 2, 5] 和 [1, 4, 5] 处有非零点，对应的 sigmas 0.1 和 1.0，则坐标 [1, 3, 5] 处的输出主要受 [1, 4, 5] 点的影响，因为该点的点差最大。

你的问题有歧义；这也可能意味着点 [1, 3, 5] 有自己的相关西格玛，例如 0.5，并从两个相邻点以相同的权重提取数据。我将采用第一个定义（与输入点相关联的西格玛值，而不是与输出点相关联）。

因为运算不是真正的卷积，所以没有快速的基于 FFT 的方法在一次运算中完成整个运算。相反，您必须遍历 sigma 值。幸运的是，您的示例只有 150 个非零点，因此循环并不太昂贵。

这是一个实现。我尽可能长时间地以稀疏表示形式保存数据。

import scipy.signal
import numpy as np

def kernel3d(mm, sigma):
    """Return (mm, mm, mm) shaped, normalized kernel."""
    g1 = scipy.signal.gaussian(mm, std=sigma)
    g3 = g1.reshape(mm, 1, 1) * g1.reshape(1, mm, 1) * g1.reshape(1, 1, mm)    
    return g3 * (1/g3.sum())

np.random.seed(1)
s = 2 # scaling factor (original problem: s=10)
nx, ny, nz, nt, nnz = 10*s, 11*s, 12*s, 91*s, 15*s
# select nnz random voxels to fill with time series data
randint = np.random.randint
tseries = {} # key: (i, j, k) tuple; value: time series data, shape (nt,)
for _ in range(nnz):
    while True:
        ijk = (randint(nx), randint(ny), randint(nz))
        if ijk not in tseries:
            tseries[ijk] = np.random.uniform(0, 1, size=nt)
            break
        
ijks = np.array(list(tseries.keys())) # shape (nnz, 3)

# sigmas: key: (i, j, k) tuple; value: standard deviation
sigmas = { k: np.random.uniform(0, 2) for k in tseries.keys() }

# output will be stored as dense array, padded to avoid edge issues 
# with convolution.
m = 5 # padding size
cag_4dp = np.zeros((nx+2*m, ny+2*m, nz+2*m, nt))

mm = 2*m + 1 # kernel width
for (i, j, k), tdata in tseries.items():
    kernel = kernel3d(mm, sigmas[(i, j, k)]).reshape(mm, mm, mm, 1)
    # convolution of one voxel by kernel is trivial.
    # slice4d_c has shape (mm, mm, mm, nt).
    slice4d_c = kernel * tdata
    cag_4dp[i:i+mm, j:j+mm, k:k+mm, :] += slice4d_c

cag_4d = cag_4dp[m:-m, m:-m, m:-m, :]

#%%

import matplotlib.pyplot as plt
fig, axs = plt.subplots(2, 2, tight_layout=True)
plt.close('all')
# find a few planes
#ks = np.where(np.any(cag_4d != 0, axis=(0, 1,3)))[0]
ks = ijks[:4, 2]
for ax, k in zip(axs.ravel(), ks):
    ax.imshow(cag_4d[:, :, k, nt//2].T)
    ax.set_title(f'Voxel [:, :, {k}] at time {nt//2}')

fig.show()

for ijk, sigma in sigmas.items():
    print(f'{ijk}: sigma={sigma:.2f}')

具有可变标准偏差内核的卷积数组

Convolve array with kernel of variable standard deviation

python

numpy

convolution

scipy