Why does numpy mixed basic / advanced indexing depend on slice adjacency?

Question

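I know similar questions have been asked before, but as far as I can tell nobody has answered my specific question...

My question is about NumPy's mixed advanced/basic indexing, as described here: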

... Two cases of index combination need to be distinguished:

The advanced indexes are separated by a slice, ellipsis or newaxis. For example x[arr1,:,arr2].

The advanced indexes are all next to each other. For example x[...,arr1,arr2,:] but not x[arr1,:,1] since 1 is an advanced index in this regard.

In the first case, the dimensions resulting from the advanced indexing operation come first in the result array, and the subspace dimensions after that. In the second case, the dimensions from the advanced indexing operations are inserted into the result array at the same spot as they were in the initial array (the latter logic is what makes simple advanced indexing behave just like slicing).
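To make the two cases concrete, here is a minimal sketch that only inspects result shapes. The array `x` and the index array `idx` are hypothetical and chosen so that each axis length is easy to recognize in the output.

```python
import numpy as np

x = np.zeros((2, 3, 4, 5))
idx = np.zeros(7, dtype=int)   # hypothetical index array of length 7

separated = x[idx, :, idx]     # axes 0 and 2 indexed, split by a slice
adjacent = x[:, idx, idx]      # axes 1 and 2 indexed, next to each other
scalar_mixed = x[0, :, idx]    # the scalar 0 counts as an advanced index here

print(separated.shape)     # (7, 3, 5): the length-7 dimension comes first
print(adjacent.shape)      # (2, 7, 5): the length-7 dimension replaces axes 1 and 2 in place
print(scalar_mixed.shape)  # (7, 3, 5): same "first case" placement
print(x[0][:, idx].shape)  # (3, 7, 5): indexing in two steps keeps the position
```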

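Why is this distinction necessary?

I expected the behavior described for case 2 to be used in all cases. Why does it matter whether the indexes are adjacent?

I understand that in some situations you might want the behavior of case 1; for example, "vectorization" of the indexing results along a new dimension. But that behavior can and should be defined by the user. That is, if the case 2 behavior were the default, the case 1 behavior would be possible using just: x[arr1,:,arr2].reshape((len(arr1),x.shape[1]))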

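I know you can achieve the behavior described in case 2 using np.ix_(), but in my opinion this inconsistency in the default indexing behavior is unexpected and unjustified. Can someone justify it?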
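For reference, a small sketch of what np.ix_() does with a hypothetical 3-D array: it builds an open mesh, so each index array selects along its own axis and the result axes stay in their original positions. Note that this is outer-product style selection, so the index arrays do not need to have equal lengths, unlike x[arr1,:,arr2], where they are broadcast against each other.

```python
import numpy as np

x = np.arange(2 * 3 * 4).reshape(2, 3, 4)
arr1 = np.array([1, 0, 1, 0])   # hypothetical indexes into axis 0 (length 4)
arr2 = np.array([3, 0])         # hypothetical indexes into axis 2 (length 2)

# np.ix_ needs one 1-D sequence per axis, so axis 1 is spelled out in full.
mesh = np.ix_(arr1, np.arange(x.shape[1]), arr2)
picked = x[mesh]

print(picked.shape)  # (4, 3, 2): every result axis sits where its source axis was

# The same selection, written as two separate indexing steps.
stepwise = x[arr1][:, :, arr2]
print(np.array_equal(picked, stepwise))  # True
```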

Thanks,

Answer 1

The behavior of case 2 is not well-defined for case 1. You have probably missed a subtlety in the following sentence:

In the second case, the dimensions from the advanced indexing operations are inserted into the result array at the same spot as they were in the initial array

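You are probably imagining a one-to-one correspondence between input and output dimensions, perhaps because you are thinking of Matlab-style indexing. NumPy doesn't work like that. If you have four arrays with the following shapes: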

a.shape == (2, 3, 4, 5, 6)
b.shape == (20, 30)
c.shape == (20, 30)
d.shape == (20, 30)

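then a[b, :, c, :, d] has four dimensions, with lengths 3, 5, 20 and 30. There is no unambiguous place to put the 20 and the 30, so NumPy defaults to sticking them at the front.

With a[:, b, c, d, :], on the other hand, the 20 and the 30 can go where the 3, 4 and 5 were, because the 3, 4 and 5 were next to each other. The whole block of new dimensions goes where the whole block of original dimensions was, which only works if the original dimensions formed a single block in the original shape.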
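The shapes above can be checked directly. This is a minimal sketch: the index arrays are filled with zeros only because zero is a valid index along every axis of a, and the actual values do not affect the result shape.

```python
import numpy as np

a = np.zeros((2, 3, 4, 5, 6))
# Index arrays of shape (20, 30); the contents are placeholders.
b = np.zeros((20, 30), dtype=int)
c = np.zeros((20, 30), dtype=int)
d = np.zeros((20, 30), dtype=int)

separated = a[b, :, c, :, d]   # indexed axes 0, 2, 4 are split by slices
adjacent = a[:, b, c, d, :]    # indexed axes 1, 2, 3 form one block

print(separated.shape)  # (20, 30, 3, 5): the broadcast dimensions go first
print(adjacent.shape)   # (2, 20, 30, 6): the broadcast dimensions replace the 3, 4, 5 block
```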
