Change the format of a numpy array with no loops

Question

I have a numpy array of shape a.shape = (1,k*d) and I want to transform it into a numpy array of shape b.shape = (k*d,k), where in each column

b[i,j] = a[i] if j<i+1

b[i,j] = 0 if not

For example:

k = 3
d= 2
**********

A =  |a|   =>  B =  |a 0 0|
     |b|            |b 0 0|
     |c|            |0 c 0|
     |d|            |0 d 0|
     |e|            |0 0 e|
     |f|            |0 0 f|

The main requirement: no loops!

What I am looking for is a sequence of numpy matrix operations that produces the desired result.

Answer 1

This reproduces your example. It can be generalized to other k and d.

In [12]: a=np.arange(6)    
In [13]: b=np.zeros((6,3))
In [14]: b[np.arange(6),np.arange(3).repeat(2)]=a

In [15]: b
Out[15]: 
array([[ 0.,  0.,  0.],
       [ 1.,  0.,  0.],
       [ 0.,  2.,  0.],
       [ 0.,  3.,  0.],
       [ 0.,  0.,  4.],
       [ 0.,  0.,  5.]])

The key is to repeat each column index the necessary number of times (d times each):

In [16]: np.arange(3).repeat(2)
Out[16]: array([0, 0, 1, 1, 2, 2])
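
As a rough sketch of how this might generalize to other k and d: the function name `block_columns_repeat` is hypothetical, and taking the output dtype from the input is my own choice (the session above uses the float default of np.zeros). It assumes the intended rule is the one shown in the question's example, that is, b[i, j] = a[i] when j == i // d.

```python
import numpy as np

def block_columns_repeat(a, k, d):
    """Scatter a length-k*d array into a (k*d, k) array, d entries per column."""
    a = np.asarray(a).ravel()            # accept shape (1, k*d) or (k*d,)
    b = np.zeros((k * d, k), dtype=a.dtype)
    rows = np.arange(k * d)              # every row is written exactly once
    cols = np.arange(k).repeat(d)        # 0,0,...,1,1,...: target column per row
    b[rows, cols] = a                    # paired (row, col) integer indexing
    return b

b = block_columns_repeat(np.arange(1, 7), k=3, d=2)
print(b)
# [[1 0 0]
#  [2 0 0]
#  [0 3 0]
#  [0 4 0]
#  [0 0 5]
#  [0 0 6]]
```

The two index arrays are paired element by element, so no Python-level loop is involved.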

Answer 2

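Here's an efficient approach based on zero-padding the input array. The inline comments at each code step should make it clearer how it achieves the desired output. Here's the code -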

# Arrange groups of d number of elements from the input array into 
# rows of a 2D array and pad with k*d zeros in each row. 
# Thus, the shape of this 2D array would be (k,d+k*d)
A_zeroappend = np.zeros((k,(k+1)*d))
A_zeroappend[:,:d] = A.reshape(-1,d)

# Flatten and drop the trailing k*d appended zeros, leaving k*k*d elements.
# Reshape to (k,k*d) and transpose to the desired output shape (k*d,k)
out = A_zeroappend.ravel()[:k*k*d].reshape(-1,k*d).T
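
To see the zero-padding steps end to end, here is a self-contained sketch on the question's k = 3, d = 2 case. The wrapper name `block_columns_zeropad` and the sample values 1..6 are assumptions made for illustration; the three array operations are the ones from the snippet above.

```python
import numpy as np

def block_columns_zeropad(A, k, d):
    """Zero-padding approach: returns an array of shape (k*d, k)."""
    A = np.asarray(A)
    # One group of d values per row, followed by k*d zeros: shape (k, (k+1)*d).
    padded = np.zeros((k, (k + 1) * d), dtype=A.dtype)
    padded[:, :d] = A.reshape(-1, d)
    # In the flattened array, group j starts at j*k*d + j*d, so after cutting
    # to k*k*d elements and reshaping to (k, k*d), row j holds its group at
    # columns j*d .. j*d+d-1. Transposing gives the (k*d, k) layout.
    return padded.ravel()[:k * k * d].reshape(-1, k * d).T

k, d = 3, 2
A = np.arange(1, 7).reshape(1, k * d)    # stand-in for a, b, c, d, e, f
out = block_columns_zeropad(A, k, d)
print(out)
# [[1 0 0]
#  [2 0 0]
#  [0 3 0]
#  [0 4 0]
#  [0 0 5]
#  [0 0 6]]
```

The extra d offset that each padded row adds is what shifts every group one block further along when the flat data is re-cut into rows of length k*d.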

Runtime test

Here's a quick runtime test comparing the proposed approach against the np.repeat based approach listed in Answer 1 -

In [292]: k = 800
     ...: d = 800
     ...: A = np.random.randint(2,9,(1,k*d))
     ...: 

In [293]: %%timeit
     ...: B = np.zeros((k*d,k))
     ...: B[np.arange(k*d),np.arange(k).repeat(d)]=A
     ...: 
1 loops, best of 3: 342 ms per loop

In [294]: %%timeit
     ...: A_zeroappend = np.zeros((k,(k+1)*d))
     ...: A_zeroappend[:,:d] = A.reshape(-1,d)
     ...: out = A_zeroappend.ravel()[:k*k*d].reshape(-1,k*d).T
     ...: 
100 loops, best of 3: 3.07 ms per loop

Seems like the proposed approach is pretty fast: 3.07 ms versus 342 ms in this test, roughly 100x faster!
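
For anyone who wants to repeat the comparison outside IPython, a sketch using the standard timeit module could look like the following. The function names and the smaller sizes (k = d = 100) are my assumptions; with k = d = 800 the (k*d, k) float64 result has 512 million entries, about 4 GB. Timings depend on the machine and NumPy version, so none are quoted here.

```python
import timeit
import numpy as np

k, d = 100, 100                          # assumed sizes, smaller than above
A = np.random.randint(2, 9, (1, k * d))

def via_repeat():
    # Index-based approach: one (row, col) pair per input element.
    B = np.zeros((k * d, k))
    B[np.arange(k * d), np.arange(k).repeat(d)] = A.ravel()
    return B

def via_zeropad():
    # Zero-padding approach: pad, flatten, cut, reshape, transpose.
    A_zeroappend = np.zeros((k, (k + 1) * d))
    A_zeroappend[:, :d] = A.reshape(-1, d)
    return A_zeroappend.ravel()[:k * k * d].reshape(-1, k * d).T

# Check that both approaches agree before timing them.
same = np.array_equal(via_repeat(), via_zeropad())

t_repeat = timeit.timeit(via_repeat, number=20)
t_zeropad = timeit.timeit(via_zeropad, number=20)
print(same, t_repeat, t_zeropad)
```

Checking equality first guards against timing two functions that do not actually compute the same thing.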
