How to get a 2D array containing indices of another 2D array

Question

Problem

import numpy as np

I have an array, with no prior information about its contents. For example:

ourarray = \
np.array([[0,1],
          [2,3],
          [4,5]])

I want to obtain the pairs of numbers that can be used to index ourarray. That is, I want to get:

array([[0, 0, 1, 1, 2, 2],
       [0, 1, 0, 1, 0, 1]])

(0,0, 0,1, 1,0, etc.: every possible index of ourarray appears as a column of this array.)

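Similar but different posts

In one post, they search for an array within another array, rather than returning the indices of the whole array.
In another, they are working with two arrays, and the objective is not to create a second array containing the indices of the first.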

Attempt 1 (successful but inefficient)

I can get this array with:

np.array(np.where(np.ones(ourarray.shape)))

This gives the desired result, but it requires creating np.ones(ourarray.shape), which does not seem like an efficient approach.

Attempt 2 (failed)

I also tried:

np.array(np.where(ourarray))

This does not work, because no indices are returned for the entries of ourarray that equal 0.
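The failure is easy to reproduce. Called with a single argument, np.where behaves like np.nonzero, so it only reports positions whose value is nonzero. A minimal sketch, reusing the array from the question:

```python
import numpy as np

ourarray = np.array([[0, 1],
                     [2, 3],
                     [4, 5]])

# np.where(a) with one argument is equivalent to np.nonzero(a),
# so the position (0, 0), which holds the value 0, is skipped.
partial = np.array(np.where(ourarray))
print(partial)
# [[0 1 1 2 2]
#  [1 0 1 0 1]]
```

Only five of the six index pairs come back, which is why a mask that is true everywhere is needed.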

Question

Attempt 1 works, but I am looking for a more efficient way. How can I do this more efficiently?

Answer 1

You can use numpy.argwhere and then .T to get what you want.

Try this:

>>> ourarray = np.array([[0,1],[2,3], [4,5]])
>>> np.argwhere(ourarray>=0).T
array([[0, 0, 1, 1, 2, 2],
       [0, 1, 0, 1, 0, 1]])

If your array can contain any value (negative numbers, NaN, and so on), you can use this instead:

ourarray = np.array([[np.nan,1],[2,np.inf], [-4,-5]])
np.argwhere(np.ones(ourarray.shape)==1).T
# array([[0, 0, 1, 1, 2, 2],
#        [0, 1, 0, 1, 0, 1]])
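To see why the value-based mask is not safe in general, note that both NaN >= 0 and any negative number >= 0 evaluate to False. The following sketch compares the two masks on the array above; the variable names value_based and shape_based are only illustrative.

```python
import numpy as np

ourarray = np.array([[np.nan, 1],
                     [2, np.inf],
                     [-4, -5]])

# NaN >= 0 and negative >= 0 are both False, so three positions are lost.
value_based = np.argwhere(ourarray >= 0).T
print(value_based.shape)   # (2, 3)

# A mask built from the shape alone never looks at the values.
shape_based = np.argwhere(np.ones(ourarray.shape, dtype=bool)).T
print(shape_based.shape)   # (2, 6)
```

Passing dtype=bool to np.ones is an optional variation on the answer's np.ones(...)==1; it builds the boolean mask directly instead of comparing a float array.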

Answer 2

How do you plan to use this index?

The tuple produced by nonzero (where) is designed for convenient indexing:

In [54]: idx = np.nonzero(np.ones_like(ourarray))
In [55]: idx
Out[55]: (array([0, 0, 1, 1, 2, 2]), array([0, 1, 0, 1, 0, 1]))
In [56]: ourarray[idx]
Out[56]: array([0, 1, 2, 3, 4, 5])

Or, equivalently, using the 2 arrays explicitly:

In [57]: ourarray[idx[0], idx[1]]
Out[57]: array([0, 1, 2, 3, 4, 5])

Your np.array(idx) can be used as in [57], but not as in [56]. Using a tuple in [56] is important.
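The difference between the tuple and the array can be checked directly. In this sketch (arr is just an illustrative name for np.array(idx)), indexing with the array does not raise an error here, but it selects whole rows instead of single elements:

```python
import numpy as np

ourarray = np.array([[0, 1],
                     [2, 3],
                     [4, 5]])

idx = np.nonzero(np.ones_like(ourarray))   # tuple of two 1-D arrays
arr = np.array(idx)                        # one (2, 6) array

# A tuple supplies one index array per axis: six scalar elements.
print(ourarray[idx].shape)     # (6,)

# A single array indexes axis 0 only, so whole rows are picked repeatedly.
print(ourarray[arr].shape)     # (2, 6, 2)

# Converting back to a tuple restores the per-axis behaviour.
print(ourarray[tuple(arr)])    # [0 1 2 3 4 5]
```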

If we apply transpose to this, we get a single array with one row per index pair.

In [58]: tidx = np.transpose(idx)
In [59]: tidx
Out[59]: 
array([[0, 0],
       [0, 1],
       [1, 0],
       [1, 1],
       [2, 0],
       [2, 1]])

To use that for indexing, we have to iterate:

In [60]: [ourarray[i,j] for i,j in tidx]
Out[60]: [0, 1, 2, 3, 4, 5]
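If the loop is not wanted, the transposed form can still be used without iteration by splitting it back into one index array per axis. A small sketch, assuming tidx has the (6, 2) layout shown above:

```python
import numpy as np

ourarray = np.array([[0, 1],
                     [2, 3],
                     [4, 5]])
tidx = np.argwhere(np.ones_like(ourarray))   # shape (6, 2), one row per index pair

# Split the columns into one index array per axis.
by_columns = ourarray[tidx[:, 0], tidx[:, 1]]

# Or transpose back and pass a tuple.
by_tuple = ourarray[tuple(tidx.T)]

print(by_columns)   # [0 1 2 3 4 5]
print(by_tuple)     # [0 1 2 3 4 5]
```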

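The argwhere proposed in the other answer is just the transpose of nonzero. Using ourarray>=0 is not really any different from the np.ones expression. Both create an array that is True/1 for all elements.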

In [61]: np.argwhere(np.ones_like(ourarray))
Out[61]: 
array([[0, 0],
       [0, 1],
       [1, 0],
       [1, 1],
       [2, 0],
       [2, 1]])

There are other ways of generating indices (np.indices, np.meshgrid, np.mgrid, np.ndindex), but they require some form of reshape and/or transpose to get what you want:

In [71]: np.indices(ourarray.shape)
Out[71]: 
array([[[0, 0],
        [1, 1],
        [2, 2]],

       [[0, 1],
        [0, 1],
        [0, 1]]])
In [72]: np.indices(ourarray.shape).reshape(2,6)
Out[72]: 
array([[0, 0, 1, 1, 2, 2],
       [0, 1, 0, 1, 0, 1]])
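For comparison, here is a sketch of how the other generators mentioned above could be coaxed into the same (2, 6) layout. The shape is hard-coded to match ourarray, and names such as from_mgrid are illustrative only.

```python
import numpy as np

shape = (3, 2)   # same shape as ourarray in the question
target = np.indices(shape).reshape(len(shape), -1)

# np.mgrid with explicit slices yields the same (2, 3, 2) grid as np.indices.
rows, cols = shape
from_mgrid = np.mgrid[0:rows, 0:cols].reshape(2, -1)

# np.meshgrid needs indexing="ij" to get row-major (matrix) ordering.
from_meshgrid = np.array(
    np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
).reshape(2, -1)

# np.ndindex iterates over index tuples in C order; collect, then transpose.
from_ndindex = np.array(list(np.ndindex(*shape))).T

for candidate in (from_mgrid, from_meshgrid, from_ndindex):
    assert np.array_equal(candidate, target)
```

None of these builds a mask first, but np.ndindex goes through a Python-level loop, so it is likely the slowest of the group on large shapes.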

Timings

If ourarray>=0 is valid for your data, it is faster than the np.ones approach:

In [79]: timeit np.ones_like(ourarray)
6.22 µs ± 11.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [80]: timeit ourarray>=0
1.43 µs ± 15 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

np.where/nonzero adds significant time:

In [81]: timeit np.nonzero(ourarray>=0)
6.43 µs ± 8.15 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

And a bit more time to convert the tuple to an array:

In [82]: timeit np.array(np.nonzero(ourarray>=0))
10.4 µs ± 35.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

The transpose round trip of argwhere adds more time:

In [83]: timeit np.argwhere(ourarray>=0).T
16.9 µs ± 35.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

indices takes roughly the same time as [82], but it may scale differently.

In [84]: timeit np.indices(ourarray.shape).reshape(2,-1)
10.9 µs ± 33.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
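The timings above are for a 3x2 array, where fixed call overhead dominates. To check how the approaches scale, something like the following sketch could be run with the standard timeit module. The 500x500 size and the labels in candidates are arbitrary choices for illustration, and no particular ranking is implied.

```python
import timeit
import numpy as np

# Hypothetical larger input; the 500 x 500 size is an arbitrary choice.
big = np.zeros((500, 500))

candidates = {
    "nonzero + array": lambda: np.array(np.nonzero(np.ones(big.shape, dtype=bool))),
    "argwhere.T": lambda: np.argwhere(np.ones(big.shape, dtype=bool)).T,
    "indices + reshape": lambda: np.indices(big.shape).reshape(big.ndim, -1),
}

for name, func in candidates.items():
    # Average over a few calls; results depend on machine and NumPy version.
    seconds = timeit.timeit(func, number=5) / 5
    print(f"{name}: {seconds * 1e3:.2f} ms per call")
```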


