Python 2.7: looping over 1D fibers in a multidimensional Numpy array

Question

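I am looking for a way to loop over the 1D fibers (rows, columns, and their multidimensional equivalents) along any dimension of an array with 3 or more dimensions.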

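In a 2D array this is fairly trivial, since the fibers are the rows and columns, so just saying for row in A gets the job done. For a 3D array, however, that expression iterates over 2D slices, not 1D fibers.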
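To make the difference concrete, here is a minimal sketch that records the shape of each item produced by plain iteration. The names A2, A3, shapes_2d and shapes_3d are made up for this illustration.

```python
import numpy as np

A2 = np.arange(9).reshape((3, 3))      # 2D: iterating yields rows
A3 = np.arange(27).reshape((3, 3, 3))  # 3D: iterating yields 2D slices

# Plain iteration always walks the first axis only.
shapes_2d = [row.shape for row in A2]
shapes_3d = [item.shape for item in A3]

print(shapes_2d)  # three 1D fibers of length 3
print(shapes_3d)  # three 2D slices of shape (3, 3)
```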

Here is a working solution:

import numpy as np
A = np.arange(27).reshape((3,3,3))
func = np.sum
for fiber_index in np.ndindex(A.shape[:-1]):
    print func(A[fiber_index])

However, I am wondering whether there is something that is:

More idiomatic
Faster

Hoping you can help!

Answer 1

I think you might be looking for numpy.apply_along_axis

In [10]: def my_func(x):
   ...:     return x**2 + x

In [11]: np.apply_along_axis(my_func, 2, A)
Out[11]: 
array([[[  0,   2,   6],
        [ 12,  20,  30],
        [ 42,  56,  72]],

       [[ 90, 110, 132],
        [156, 182, 210],
        [240, 272, 306]],

       [[342, 380, 420],
        [462, 506, 552],
        [600, 650, 702]]])

That said, many NumPy functions (including sum) have their own axis argument to specify which axis to work along:

In [12]: np.sum(A, axis=2)
Out[12]: 
array([[ 3, 12, 21],
       [30, 39, 48],
       [57, 66, 75]])

Answer 2

numpy provides a number of different ways of looping over 1 or more dimensions.

Your example:

func = np.sum
for fiber_index in np.ndindex(A.shape[:-1]):
    print fiber_index
    print A[fiber_index]

produces something like:

(0, 0)
[0 1 2]
(0, 1)
[3 4 5]
(0, 2)
[6 7 8]
...

It generates all index combinations over the first 2 dimensions, which hands your function the 1D fiber along the last one.

Look at the code for ndindex. It's instructive. I have tried to extract its essence elsewhere.

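It uses as_strided to generate a dummy matrix over which nditer iterates. It uses the 'multi_index' mode to produce the set of indices, rather than the elements of that dummy. The iteration itself is done with a __next__ method. This is the same style of indexing that is currently used in numpy's compiled code.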
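The idea described above can be sketched roughly as follows. This is a simplified imitation, not the actual ndindex source; the name my_ndindex is made up, and a generator with an explicit while loop stands in for the class with a __next__ method.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def my_ndindex(shape):
    # Dummy array of the requested shape. All strides are 0, so every
    # position points at the same single element and no real memory is used.
    dummy = as_strided(np.zeros(1), shape=shape, strides=(0,) * len(shape))
    # 'multi_index' makes nditer track the N-d index of each position.
    it = np.nditer(dummy, flags=['multi_index'], order='C')
    while not it.finished:
        yield it.multi_index   # the index tuple, not the dummy's value
        it.iternext()

for idx in my_ndindex((2, 3)):
    print(idx)
```

The values of the dummy array are never used; only the index bookkeeping done by nditer matters.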

Iterating Over Arrays (http://docs.scipy.org/doc/numpy-dev/reference/arrays.nditer.html) has a good explanation, including an example of doing this in cython.

Many functions, among them sum, max and product, let you specify which axis (or axes) you want to iterate over. Your example, with sum, can be written as:

np.sum(A, axis=-1)
np.sum(A, axis=(1,2))   # sum over 2 axes

The first of these is equivalent to

np.add.reduce(A, axis=-1)

np.add is a ufunc, and reduce specifies an iteration mode. There are many other ufuncs, and other iteration modes - accumulate, reduceat. You can also define your own ufunc.
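As a rough illustration of those iteration modes, applied along the last axis of the same A as in the question. The variable names and the choice of np.frompyfunc for building a custom ufunc are mine; frompyfunc returns object arrays and runs at Python speed, so treat it as a convenience rather than a fast path.

```python
import numpy as np

A = np.arange(27).reshape((3, 3, 3))

# reduce: collapse each last-axis fiber to one value (same as np.sum).
total = np.add.reduce(A, axis=-1)

# accumulate: running sum along each fiber (same as np.cumsum).
running = np.add.accumulate(A, axis=-1)

# reduceat: sum the pieces [0:2] and [2:] of each fiber separately.
segments = np.add.reduceat(A, [0, 2], axis=-1)

# A user-defined elementwise ufunc built from a Python function
# (1 input, 1 output). The result has dtype object, hence the astype.
square_plus = np.frompyfunc(lambda x: x ** 2 + x, 1, 1)
squared = square_plus(A).astype(A.dtype)
```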

xnx suggested

np.apply_along_axis(np.sum, 2, A)

It's worth digging through apply_along_axis to see how it steps through the dimensions of A. In your example, it steps over all possible i,j in a while loop, calculating:

outarr[(i,j)] = np.sum(A[(i, j, slice(None))])

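Including slice objects in the indexing tuple is a nice trick. Note that it edits a list, and then converts it to a tuple for indexing. That's because tuples are immutable.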
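A small sketch of that list-then-tuple trick in isolation, picking out one fiber along axis 1. The names axis, idx and fiber are only for illustration.

```python
import numpy as np

A = np.arange(27).reshape((3, 3, 3))
axis = 1                      # the axis the fiber runs along

# Build the index as a list, because a list can be edited in place.
idx = [0] * A.ndim
idx[axis] = slice(None)       # plays the role of ':' on the chosen axis
idx[0], idx[2] = 2, 1         # fix the remaining positions

# Convert to a tuple only at the moment of indexing.
fiber = A[tuple(idx)]         # same as A[2, :, 1]
print(fiber)
```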

Your iteration can be applied along any axis by rolling that axis to the end. This is a 'cheap' operation since it just changes the strides.

def with_ndindex(A, func, ax=-1):
    # apply func along axis ax
    A = np.rollaxis(A, ax, A.ndim) # roll ax to end (changes strides)
    shape = A.shape[:-1]
    B = np.empty(shape,dtype=A.dtype)
    for ii in np.ndindex(shape):
        B[ii] = func(A[ii])
    return B

I did some timings on 3x3x3, 10x10x10 and 100x100x100 A arrays. This np.ndindex approach is consistently a third faster than the apply_along_axis approach. Direct use of np.sum(A, -1) is much faster.
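If you want to repeat this kind of comparison yourself, something along these lines should do. It is only a sketch: the 20x20x20 size, the repeat counts and the helper name ndindex_sum are arbitrary choices of mine, and the numbers will vary with machine and numpy version.

```python
import timeit
import numpy as np

A = np.arange(20 ** 3).reshape((20, 20, 20))

def ndindex_sum(A):
    # The ndindex loop from the question, collecting results in an array.
    shape = A.shape[:-1]
    B = np.empty(shape, dtype=A.dtype)
    for ii in np.ndindex(shape):
        B[ii] = np.sum(A[ii])
    return B

candidates = {
    'ndindex loop': lambda: ndindex_sum(A),
    'apply_along_axis': lambda: np.apply_along_axis(np.sum, -1, A),
    'np.sum(axis=-1)': lambda: np.sum(A, axis=-1),
}

timings = {}
for name, fn in candidates.items():
    # Best of 3 runs, each run calling the function 5 times.
    timings[name] = min(timeit.repeat(fn, number=5, repeat=3))
    print('%-18s %.6f s' % (name, timings[name]))
```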

So if func is limited to operating on a 1D fiber (unlike sum), then the ndindex approach is a good choice.
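For instance, here is a sketch with a function that only makes sense on a single sorted 1D vector. The helper apply_to_fibers is a hypothetical variant of the ndindex loop: it uses np.moveaxis instead of rollaxis and takes the output dtype from the first result, which are my own choices. It assumes func returns a scalar and that A has at least one fiber.

```python
import numpy as np

def apply_to_fibers(A, func, ax=-1):
    # Move the chosen axis to the end; this only changes strides.
    A = np.moveaxis(A, ax, -1)
    shape = A.shape[:-1]
    out = None
    for ii in np.ndindex(shape):
        val = func(A[ii])
        if out is None:
            # Allocate lazily so the dtype matches what func returns.
            out = np.empty(shape, dtype=np.asarray(val).dtype)
        out[ii] = val
    return out

def first_at_least_10(v):
    # Position of the first element >= 10 in a sorted 1D fiber.
    return np.searchsorted(v, 10)

A = np.arange(27).reshape((3, 3, 3))
result = apply_to_fibers(A, first_at_least_10, ax=0)
print(result)
```

The fibers along axis 0 here are [k, k+9, k+18], so they are already sorted, which is what np.searchsorted needs.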
