Numpy: Insert arbitrary number of zeros into matrix rows at various indices

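Question

I have a 2D array containing a series of 0's and 1's that represent bit-packed values. I need to insert an arbitrary number of 0's at arbitrary points in each row in order to pad the bit-packed values to a multiple of 8 bits.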

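I have 3 vectors.

  1. A vector containing the indices at which I want to insert zeros.
  2. A vector containing the number of zeros I want to insert at each point from vector 1.
  3. A vector containing the size of each bit string I am padding. (Probably not needed to solve this, but it could be interesting!)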

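Example

I have a vector containing the indices to insert before: [0 6 14]

and a vector containing the number of zeros I want to insert: [2 0 4]

and a vector that has the size of each bit string I am padding: [6, 8, 4]

The goal is to insert zeros into each row of the array: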

[[0 0 0 0 0 1  0 0 0 0 0 0 0 1  0 0 0 1]
 [0 0 0 0 0 1  0 0 0 0 0 0 1 0  0 0 0 1]
 [0 0 0 0 1 0  0 0 0 0 0 0 1 0  0 0 1 0]
 [0 0 0 0 1 1  0 0 0 0 0 1 0 0  0 0 1 1]
 [0 0 0 1 0 0  0 0 0 0 0 1 0 0  0 1 0 0]
 [0 0 0 1 0 1  0 0 0 0 0 1 1 0  0 1 0 1]
 [0 0 0 1 1 0  0 0 0 0 0 1 1 0  0 1 1 0]
 [0 0 0 1 1 1  0 0 0 0 1 0 0 0  0 1 1 1]
 [0 0 1 0 0 0  0 0 0 0 1 0 0 0  1 0 0 0]
 [1 1 0 0 1 0  1 1 1 1 1 1 1 1  1 0 0 1]]

*Spaces added between columns to highlight insertion points.

becomes:

  | |                               | | | |
  v v                               v v v v
[[0 0 0 0 0 0 0 1  0 0 0 0 0 0 0 1  0 0 0 0 0 0 0 1]
 [0 0 0 0 0 0 0 1  0 0 0 0 0 0 1 0  0 0 0 0 0 0 0 1]
 [0 0 0 0 0 0 1 0  0 0 0 0 0 0 1 0  0 0 0 0 0 0 1 0]
 [0 0 0 0 0 0 1 1  0 0 0 0 0 1 0 0  0 0 0 0 0 0 1 1]
 [0 0 0 0 0 1 0 0  0 0 0 0 0 1 0 0  0 0 0 0 0 1 0 0]
 [0 0 0 0 0 1 0 1  0 0 0 0 0 1 1 0  0 0 0 0 0 1 0 1]
 [0 0 0 0 0 1 1 0  0 0 0 0 0 1 1 0  0 0 0 0 0 1 1 0]
 [0 0 0 0 0 1 1 1  0 0 0 0 1 0 0 0  0 0 0 0 0 1 1 1]
 [0 0 0 0 1 0 0 0  0 0 0 0 1 0 0 0  0 0 0 0 1 0 0 0]
 [0 0 1 1 0 0 1 0  1 1 1 1 1 1 1 1  0 0 0 0 1 0 0 1]]

*Arrows denote inserted 0's

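I am looking for the most efficient way to do this. All of the vectors/arrays are numpy arrays. I looked into using numpy.insert, but it does not seem able to insert multiple values at a given index. I also considered using numpy.hstack and then flattening, but I could not produce the result I want.

Any help is greatly appreciated!

My approach is to create an array of zeros up front and copy the columns into the correct positions. The indexing is a bit hairy in terms of clarity, so there may be room for improvement.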


import numpy as np

data = np.array(
  [[0, 0, 0, 0, 0, 1,  0, 0, 0, 0, 0, 0, 0, 1,  0, 0, 0, 1],
   [0, 0, 0, 0, 0, 1,  0, 0, 0, 0, 0, 0, 1, 0,  0, 0, 0, 1],
   [0, 0, 0, 0, 1, 0,  0, 0, 0, 0, 0, 0, 1, 0,  0, 0, 1, 0],
   [0, 0, 0, 0, 1, 1,  0, 0, 0, 0, 0, 1, 0, 0,  0, 0, 1, 1],
   [0, 0, 0, 1, 0, 0,  0, 0, 0, 0, 0, 1, 0, 0,  0, 1, 0, 0],
   [0, 0, 0, 1, 0, 1,  0, 0, 0, 0, 0, 1, 1, 0,  0, 1, 0, 1],
   [0, 0, 0, 1, 1, 0,  0, 0, 0, 0, 0, 1, 1, 0,  0, 1, 1, 0],
   [0, 0, 0, 1, 1, 1,  0, 0, 0, 0, 1, 0, 0, 0,  0, 1, 1, 1],
   [0, 0, 1, 0, 0, 0,  0, 0, 0, 0, 1, 0, 0, 0,  1, 0, 0, 0],
   [1, 1, 0, 0, 1, 0,  1, 1, 1, 1, 1, 1, 1, 1,  1, 0, 0, 1]])
insert_before = [0, 6, 14]
zero_pads = [2, 0, 4]  # number of zeros to insert before each index

# one 8-bit output slot per input field, pre-filled with zeros
res = np.zeros((len(data), 8*len(zero_pads)), dtype=int)

for i in range(len(zero_pads)):
    # right-align field i inside its 8-bit slot; the leading columns stay zero
    res[:, i*8+zero_pads[i]:(i+1)*8] = data[:, insert_before[i]:insert_before[i]+8-zero_pads[i]]


>>> print(res)
[[0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1]
 [0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1]
 [0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0]
 [0 0 0 0 0 0 1 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1]
 [0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0]
 [0 0 0 0 0 1 0 1 0 0 0 0 0 1 1 0 0 0 0 0 0 1 0 1]
 [0 0 0 0 0 1 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 1 1 0]
 [0 0 0 0 0 1 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 1]
 [0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0]
 [0 0 1 1 0 0 1 0 1 1 1 1 1 1 1 1 0 0 0 0 1 0 0 1]]
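
The same preallocate-and-copy idea can also be driven directly by the third vector from the question (the field sizes), since the insertion points and pad counts follow from it. The sketch below is not part of the answer itself: `pad_fields_to_bytes` is a hypothetical helper name, and it assumes every field is at most 8 bits wide so that each one fits into a single 8-bit slot.

```python
import numpy as np

def pad_fields_to_bytes(bits, sizes):
    """Left-pad each bit field with zeros to 8 bits (assumes every size <= 8)."""
    sizes = np.asarray(sizes)
    # Each field starts where the previous one ends.
    starts = np.concatenate(([0], np.cumsum(sizes)[:-1]))
    out = np.zeros((bits.shape[0], 8 * len(sizes)), dtype=bits.dtype)
    for i, (start, size) in enumerate(zip(starts, sizes)):
        # Right-align field i inside output columns [8*i, 8*(i+1)).
        out[:, (i + 1) * 8 - size:(i + 1) * 8] = bits[:, start:start + size]
    return out

# Last row of the example matrix, with field sizes [6, 8, 4].
row = np.array([[1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1]])
padded = pad_fields_to_bytes(row, [6, 8, 4])
print(padded)
# [[0 0 1 1 0 0 1 0 1 1 1 1 1 1 1 1 0 0 0 0 1 0 0 1]]
```

With this form only the sizes need to be passed around; fields wider than 8 bits would need a different slot width and are deliberately left out of the sketch.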

I formatted the matrix for you (although a contrived example might have been easier to work with):

import numpy as np

matrix = np.array([[0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1],
                   [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1],
                   [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0],
                   [0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1],
                   [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
                   [0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1],
                   [0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0],
                   [0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1],
                   [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0],
                   [1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1]])
indices = np.array([0, 6, 14])
num_zeros = np.array([2, 0, 4])
pad = np.array([6, 8, 4])

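You need to allocate a new array to do this. Creating a zero-filled array in numpy is very cheap, so let's start by allocating a zero-filled array with the output shape we want: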

out_shape = np.array(matrix.shape)
out_shape[1] += num_zeros.sum()
zeros = np.zeros(out_shape, dtype=matrix.dtype)

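Now use slices to write matrix into contiguous blocks of zeros:

meta = np.stack([indices, num_zeros])
meta = meta[:, meta[1] != 0] # throw away 0 slices
# source column boundaries of each contiguous block of matrix
bounds = np.concatenate([[0], meta[0], [matrix.shape[1]]])
# total number of zeros inserted before each block
shifts = np.concatenate([[0], meta[1].cumsum()])

for start, end, shift in zip(bounds[:-1], bounds[1:], shifts):
    # copy the block to its position in the output, offset by the zeros before it
    zeros[:, start + shift:end + shift] = matrix[:, start:end]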

The output in zeros:

[[0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1]
 [0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1]
 [0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0]
 [0 0 0 0 0 0 1 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1]
 [0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0]
 [0 0 0 0 0 1 0 1 0 0 0 0 0 1 1 0 0 0 0 0 0 1 0 1]
 [0 0 0 0 0 1 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 1 1 0]
 [0 0 0 0 0 1 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 1]
 [0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0]
 [0 0 1 1 0 0 1 0 1 1 1 1 1 1 1 1 0 0 0 0 1 0 0 1]]
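
If the Python loop over blocks ever becomes a bottleneck, one loop-free alternative is to compute the destination column of every source column and write them all in a single assignment. This is a sketch rather than the method shown above: `insert_zero_columns` is a hypothetical name, and it assumes each entry of `indices` refers to a column of the original matrix (or to its width, meaning "append at the end").

```python
import numpy as np

def insert_zero_columns(matrix, indices, num_zeros):
    """Insert num_zeros[k] zero columns before original column indices[k]."""
    indices = np.asarray(indices)
    num_zeros = np.asarray(num_zeros)
    n_cols = matrix.shape[1]

    # shift[c] = number of zeros inserted at or before source column c.
    shift = np.zeros(n_cols + 1, dtype=int)
    np.add.at(shift, indices, num_zeros)   # accumulates even if an index repeats
    shift = shift.cumsum()[:n_cols]

    out = np.zeros((matrix.shape[0], n_cols + num_zeros.sum()), dtype=matrix.dtype)
    # Scatter every source column to its shifted position in one assignment.
    out[:, np.arange(n_cols) + shift] = matrix
    return out

m = np.arange(1, 7).reshape(1, 6)          # [[1 2 3 4 5 6]]
print(insert_zero_columns(m, [0, 2, 5], [1, 2, 1]))
# [[0 1 2 0 0 3 4 5 0 6]]
```

Whether this is actually faster than a handful of slice copies depends on the shape of the data, so it is worth timing both on realistic input before choosing.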

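np.insert does support inserting multiple values at the same index; you just need to supply that index multiple times. So you can get the result you want as follows: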

indices = np.array([0, 6, 14])
n_zeros = np.array([2, 0, 4])

result = np.insert(matrix,
                   np.repeat(indices, n_zeros),
                   0,
                   axis=1)
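
To make the role of np.repeat concrete, here is a small self-contained run on a single row (the last row of the example matrix). The variable names `row` and `positions` are only for illustration.

```python
import numpy as np

row = np.array([[1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1]])
indices = np.array([0, 6, 14])
n_zeros = np.array([2, 0, 4])

# Each index is repeated once per zero to insert; index 6 (zero repeats) drops out.
positions = np.repeat(indices, n_zeros)
print(positions)
# [ 0  0 14 14 14 14]

# Every entry of positions refers to a column of the original row.
result = np.insert(row, positions, 0, axis=1)
print(result)
# [[0 0 1 1 0 0 1 0 1 1 1 1 1 1 1 1 0 0 0 0 1 0 0 1]]
```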