Array based indexing of an ndarray

Question

I don't understand numpy.take, even though it seems to be the function I want. I have an ndarray, and I want to use another ndarray to index into the first one.

import numpy as np

# Create a matrix
A = np.arange(75).reshape((5,5,3))

# Create the index array
idx = np.array([[1, 0, 0, 1, 1],
                [1, 1, 0, 1, 1],
                [1, 0, 1, 0, 1],
                [1, 1, 0, 0, 0],
                [1, 1, 1, 1, 0]])

Given the above, I want to index A by the values in idx. I thought take would do this, but it does not output what I expected.

# Index the 3rd dimension of the A matrix by the idx array.

Asub = np.take(A, idx)

print(f'Value in A at 1,1,1 is {A[1,1,1]}')
print(f'Desired index from idx {idx[1,1]}')

print(f'Value in Asub at [1,1,1] {Asub[1,1]} <- thought this would be 19')

I was hoping that the value at each position would be the value in A selected by idx at that position. The actual output is:

Value in A at 1,1,1 is 19
Desired index from idx 1
Value in Asub at [1,1,1] 1 <- thought this would be 19

Answer 1

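One possibility is to create row and column indices that broadcast with the index for the third dimension, i.e. a (5,1) array and a (5,) array that pair with the (5,5) idx: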

In [132]: A[np.arange(5)[:,None],np.arange(5), idx]
Out[132]: 
array([[ 1,  3,  6, 10, 13],
       [16, 19, 21, 25, 28],
       [31, 33, 37, 39, 43],
       [46, 49, 51, 54, 57],
       [61, 64, 67, 70, 72]])

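This ends up picking values from A[:,:,0] and A[:,:,1]. It treats the values of idx as integers, which must lie in the valid range (0,1,2) for an axis of size 3. They are not boolean selectors.

Out[132][1,1] is 19, the same as A[1,1,1]; Out[132][1,2] is the same as A[1,2,0].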
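To make the pairing of shapes concrete, here is a minimal sketch. The names rows, cols and result are mine, introduced only for illustration; everything else comes from the question.

```python
import numpy as np

A = np.arange(75).reshape((5, 5, 3))
idx = np.array([[1, 0, 0, 1, 1],
                [1, 1, 0, 1, 1],
                [1, 0, 1, 0, 1],
                [1, 1, 0, 0, 0],
                [1, 1, 1, 1, 0]])

rows = np.arange(5)[:, None]   # shape (5, 1): the row number, as a column
cols = np.arange(5)            # shape (5,):   the column number

# The three index arrays broadcast against each other to idx's shape.
print(np.broadcast(rows, cols, idx).shape)   # (5, 5)

# result[i, j] is A[i, j, idx[i, j]]
result = A[rows, cols, idx]
print(result[1, 1], result[1, 2])            # 19 21
```

The (5,1) and (5,) arrays only supply i and j for every position; idx supplies the position along the last axis.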

take_along_axis gets the same values, but with an added dimension:

In [142]: np.take_along_axis(A, idx[:,:,None], 2).shape
Out[142]: (5, 5, 1)

In [143]: np.take_along_axis(A, idx[:,:,None], 2)[:,:,0]
Out[143]: 
array([[ 1,  3,  6, 10, 13],
       [16, 19, 21, 25, 28],
       [31, 33, 37, 39, 43],
       [46, 49, 51, 54, 57],
       [61, 64, 67, 70, 72]])
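As a side-by-side sketch of why the question's np.take call returned 1: when no axis is given, np.take indexes the flattened array, whereas take_along_axis matches each index with its own position along the chosen axis. The names flat and picked are mine.

```python
import numpy as np

A = np.arange(75).reshape((5, 5, 3))
idx = np.array([[1, 0, 0, 1, 1],
                [1, 1, 0, 1, 1],
                [1, 0, 1, 0, 1],
                [1, 1, 0, 0, 0],
                [1, 1, 1, 1, 0]])

# No axis argument: every result is A.ravel()[0] or A.ravel()[1].
flat = np.take(A, idx)
print(flat[1, 1])    # 1

# take_along_axis needs idx to have as many dimensions as A, hence the
# trailing length-1 axis, which is indexed away again afterwards.
picked = np.take_along_axis(A, idx[..., None], axis=2)[..., 0]
print(picked[1, 1])  # 19
```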

The iterative equivalent may be easier to understand:

In [145]: np.array([[A[i,j,idx[i,j]] for j in range(5)] for i in range(5)])
Out[145]: 
array([[ 1,  3,  6, 10, 13],
       [16, 19, 21, 25, 28],
       [31, 33, 37, 39, 43],
       [46, 49, 51, 54, 57],
       [61, 64, 67, 70, 72]])

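If you have trouble expressing an operation in the "vectorized" array style, go ahead and write an iterative version first. That avoids a lot of ambiguity and misunderstanding.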
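One way to use that advice, sketched here under my own assumptions (a randomly generated idx with values 0 to 2, and the names expected and vectorized), is to keep the loop as a reference and check the vectorized expression against it:

```python
import numpy as np

A = np.arange(75).reshape((5, 5, 3))
# Assumption for illustration: a random idx covering the full 0..2 range.
rng = np.random.default_rng(0)
idx = rng.integers(0, 3, size=(5, 5))

# Explicit loop: slow, but unambiguous about what is wanted.
expected = np.empty(idx.shape, dtype=A.dtype)
for i in range(idx.shape[0]):
    for j in range(idx.shape[1]):
        expected[i, j] = A[i, j, idx[i, j]]

# Candidate vectorized version, checked against the loop.
vectorized = A[np.arange(5)[:, None], np.arange(5), idx]
print(np.array_equal(expected, vectorized))  # True
```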

Another way to get the same values, this time treating the idx values as True/False booleans, is:

In [146]: np.where(idx, A[:,:,1], A[:,:,0])
Out[146]: 
array([[ 1,  3,  6, 10, 13],
       [16, 19, 21, 25, 28],
       [31, 33, 37, 39, 43],
       [46, 49, 51, 54, 57],
       [61, 64, 67, 70, 72]])
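The np.where form only distinguishes zero from nonzero, so it matches the integer indexing only while idx holds nothing but 0 and 1. A small sketch of the difference, using a modified copy of idx (idx3, my own variation, with one entry set to 2) and np.choose as a three-way alternative:

```python
import numpy as np

A = np.arange(75).reshape((5, 5, 3))
idx = np.array([[1, 0, 0, 1, 1],
                [1, 1, 0, 1, 1],
                [1, 0, 1, 0, 1],
                [1, 1, 0, 0, 0],
                [1, 1, 1, 1, 0]])

idx3 = idx.copy()
idx3[0, 0] = 2   # hypothetical variation: one entry selects the third plane

reference = A[np.arange(5)[:, None], np.arange(5), idx3]

# np.where chooses between two planes only; 2 is truthy, so it acts like 1.
two_way = np.where(idx3, A[:, :, 1], A[:, :, 0])

# np.choose picks from a list of planes, one per possible idx value.
three_way = np.choose(idx3, [A[:, :, 0], A[:, :, 1], A[:, :, 2]])

print(reference[0, 0], two_way[0, 0], three_way[0, 0])  # 2 1 2
print(np.array_equal(three_way, reference))             # True
```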

Answer 2

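IIUC, you can get the result by giving idx a trailing axis so that it broadcasts against A, multiplying the two, and then taking index 1 along the last axis. Positions where idx is 0 come out as 0: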

Asub = (A * idx[:, :,  None])[:, :, 1]    # --> Asub[1, 1] = 19

# [[ 1  0  0 10 13]
#  [16 19  0 25 28]
#  [31  0 37  0 43]
#  [46 49  0  0  0]
#  [61 64 67 70  0]]

I think this will be the fastest way (or one of the fastest), particularly for large arrays.
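For comparison, a short sketch of how this result relates to selecting along the last axis. The names masked, same and selected are mine; no timing is claimed here.

```python
import numpy as np

A = np.arange(75).reshape((5, 5, 3))
idx = np.array([[1, 0, 0, 1, 1],
                [1, 1, 0, 1, 1],
                [1, 0, 1, 0, 1],
                [1, 1, 0, 0, 0],
                [1, 1, 1, 1, 0]])

masked = (A * idx[:, :, None])[:, :, 1]   # the expression from this answer
same = A[:, :, 1] * idx                   # same values, no (5, 5, 3) temporary
selected = np.take_along_axis(A, idx[:, :, None], axis=2)[:, :, 0]

print(np.array_equal(masked, same))       # True
# Where idx is 0 the two approaches differ: 0 versus A[i, j, 0].
print(masked[0, 1], selected[0, 1])       # 0 3
# Where idx is 1 they agree.
print(np.array_equal(masked[idx == 1], selected[idx == 1]))  # True
```

Which behavior is wanted depends on whether a 0 in idx means "take plane 0" or "blank this position".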

