Why does the order of dimensions change with boolean indexing?

Question

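When we have M of shape (a, b, c) and a boolean index array v that we use to index the last axis, why does M[i, :, v] produce an array of shape (d, b), where d is the number of True values in v? As illustrated below: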

In [409]: M = zeros((100, 20, 40))

In [410]: val = ones(shape=(40,), dtype="bool")

In [411]: M[0, :, :].shape
Out[411]: (20, 40)  # As expected

In [412]: M[0, :, val].shape
Out[412]: (40, 20)  # Huh?  Why (40, 20), not (20, 40)?

In [413]: M[:, :, val].shape
Out[413]: (100, 20, 40)  # As expected again

Why does M[0, :, val] have shape (40, 20) rather than (20, 40)?

Answer 1

According to the boolean indexing section of the docs http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#boolean-array-indexing

Combining multiple Boolean indexing arrays or a Boolean with an integer indexing array can best be understood with the obj.nonzero() analogy.

ind = np.nonzero(val)[0]
# array([ 0,  1,  2, ...., 39], dtype=int32)
M[0, :, ind].shape   # (40,20)
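As a quick check of the nonzero() analogy, here is a minimal sketch that compares the boolean mask with its integer equivalent. The mask below has only three True entries (an assumption made for illustration, not the all-True val from the question) so that the d dimension is easy to tell apart from the others.

```python
import numpy as np

# Distinct values so that element-wise comparisons are meaningful.
M = np.arange(100 * 20 * 40).reshape(100, 20, 40)

# Illustrative mask with d = 3 True entries (not the all-True mask from the question).
val = np.zeros(40, dtype=bool)
val[[3, 7, 21]] = True

ind = np.nonzero(val)[0]      # integer positions of the True entries
by_mask = M[0, :, val]        # indexing with the boolean mask
by_ints = M[0, :, ind]        # indexing with the equivalent integer array
same = np.array_equal(by_mask, by_ints)
```

Both results have shape (d, 20), which suggests the reordering comes from the integer-array indexing rules rather than from anything specific to booleans.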

Now we move on to the section about combining advanced and basic indexing http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#combining-advanced-and-basic-indexing

This is a case of the form x[arr1, :, arr2]:

in the first case, the dimensions resulting from the advanced indexing operation come first in the result array, and the subspace dimensions after that.

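So the 0 and ind parts together produce a (40,) selection, while the : in the middle produces (20,). Because the : part is placed last, the resulting shape is (40,20). Basically it does this because this style of indexing is ambiguous: the advanced indices are separated by a slice, so there is no single obvious position for the dimension they produce. NumPy therefore consistently opts to put the slice part at the end.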
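The rule above can be sketched as follows. This is only an illustration under the assumption of a three-True mask (so the d dimension stands out); it also shows two simple ways to get the (20, d) layout: applying the basic index first so that the mask is the only advanced index, or transposing the result.

```python
import numpy as np

M = np.arange(100 * 20 * 40).reshape(100, 20, 40)

# Illustrative mask with d = 3 True entries.
val = np.zeros(40, dtype=bool)
val[[3, 7, 21]] = True

# 0 and val are separated by a slice, so the advanced dimension comes first.
separated = M[0, :, val]

# Index with 0 first; the mask is then the only advanced index and its
# dimension stays where it was.
two_step = M[0][:, val]

# Alternatively, transpose the (d, 20) result.
transposed = separated.T
```

Here two_step and transposed hold the same values; only separated has the axes swapped.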

One way of selecting these values while keeping the shape you want (more or less) is to use np.ix_ to generate the tuple of indices.

M[np.ix_([0],np.arange(20),ind)].shape # (1, 20, 40)

You can remove the initial length-1 dimension with np.squeeze.

This use of ix_ is illustrated at the end of the 'purely integer array indexing' section http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#purely-integer-array-indexing
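Putting the np.ix_ and np.squeeze steps together, a minimal sketch with the shapes from the question might look like this. The axis=0 argument is a choice made here so that only the leading length-1 dimension is removed.

```python
import numpy as np

M = np.arange(100 * 20 * 40).reshape(100, 20, 40)
val = np.ones(40, dtype=bool)   # all-True mask, as in the question
ind = np.nonzero(val)[0]

# np.ix_ builds an open mesh, so each index sequence keeps its own axis.
block = M[np.ix_([0], np.arange(20), ind)]

# Drop only the leading length-1 axis.
squeezed = np.squeeze(block, axis=0)
```

With an all-True mask, squeezed matches M[0] exactly, with the axes in their original order.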
