NumPy rolling window over 2D array, as a 1D array with nested array as data values

Question

When using np.lib.stride_tricks.as_strided, how can I handle a 2D array whose data values are nested arrays? Is there a preferable, efficient approach?

Specifically, suppose I have a 2D np.array like the following, where each data item in the outer 1D sequence is an array of length 2:

[[1., 2.],[3., 4.],[5.,6.],[7.,8.],[9.,10.]...]

I want to reshape it into rolling windows, as follows:

[[[1., 2.],[3., 4.],[5.,6.]],
 [[3., 4.],[5.,6.],[7.,8.]],
 [[5.,6.],[7.,8.],[9.,10.]],
  ...
]

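I have looked at similar answers (e.g. this rolling window function), but when I use them I cannot keep the inner arrays/tuples intact.

For example, with a window length of 3: I tried a shape of (len(seq)+3-1, 3, 2) and strides of (2 * 8, 2 * 8, 8), but had no luck. Maybe I am missing something obvious?

Cheers.

EDIT: It is easy to produce a functionally identical solution using Python built-ins (which can be optimised using e.g. np.arange, similar to Divakar's solution). But what about using as_strided? As I understand it, this could be used for a highly efficient solution?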

Answer 1

IIUC, you could do something like this -

import numpy as np

def rolling_window2D(a,n):
    # a: 2D Input array 
    # n: Group/sliding window length
    # Row indices of every window, built by broadcasting starts + offsets
    return a[np.arange(a.shape[0]-n+1)[:,None] + np.arange(n)]

Sample run -

In [110]: a
Out[110]: 
array([[ 1,  2],
       [ 3,  4],
       [ 5,  6],
       [ 7,  8],
       [ 9, 10]])

In [111]: rolling_window2D(a,3)
Out[111]: 
array([[[ 1,  2],
        [ 3,  4],
        [ 5,  6]],

       [[ 3,  4],
        [ 5,  6],
        [ 7,  8]],

       [[ 5,  6],
        [ 7,  8],
        [ 9, 10]]])
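
To see why the one-liner above works, here is a minimal sketch of the intermediate index matrix. The variable names (idx, windows) are only illustrative. Note that integer-array indexing returns a copy, so this approach trades some memory for safety compared with a strided view:

```python
import numpy as np

a = np.arange(1, 11).reshape(5, 2)
n = 3  # window length

# A column of window start rows, shape (3, 1), plus a row of offsets,
# shape (3,), broadcasts to a (3, 3) matrix: one row of indices per window.
idx = np.arange(a.shape[0] - n + 1)[:, None] + np.arange(n)
print(idx)
# [[0 1 2]
#  [1 2 3]
#  [2 3 4]]

# Indexing with an integer array gathers whole rows of `a` and copies them.
windows = a[idx]
print(windows.shape)                  # (3, 3, 2)
print(np.shares_memory(windows, a))   # False
```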

Answer 2

What was wrong with your as_strided attempt? It works for me.

In [28]: x=np.arange(1,11.).reshape(5,2)
In [29]: x.shape
Out[29]: (5, 2)
In [30]: x.strides
Out[30]: (16, 8)
In [31]: np.lib.stride_tricks.as_strided(x,shape=(3,3,2),strides=(16,16,8))
Out[31]: 
array([[[  1.,   2.],
        [  3.,   4.],
        [  5.,   6.]],

       [[  3.,   4.],
        [  5.,   6.],
        [  7.,   8.]],

       [[  5.,   6.],
        [  7.,   8.],
        [  9.,  10.]]])

In my first edit I used an int array, so I had to use (8,8,4) for the strides.
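
Since the right strides depend on the element size, one option is to read them from the array rather than hard-coding them. The following is a minimal sketch under that assumption; the helper name rolling_window_strided is hypothetical and not part of NumPy. It also computes the first dimension as rows - n + 1, the number of complete windows:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def rolling_window_strided(a, n):
    """Hypothetical helper: view of all length-n windows over the rows of 2D `a`."""
    if a.ndim != 2 or not 1 <= n <= a.shape[0]:
        raise ValueError("expected a 2D array and 1 <= n <= number of rows")
    row_stride, col_stride = a.strides
    # Number of complete windows is rows - n + 1; anything larger reads
    # past the end of the buffer.
    shape = (a.shape[0] - n + 1, n, a.shape[1])
    # Stepping to the next window and to the next row inside a window
    # both advance by one row of the original array.
    strides = (row_stride, row_stride, col_stride)
    return as_strided(a, shape=shape, strides=strides)

xf = np.arange(1, 11.).reshape(5, 2)                  # 8-byte floats
xi = np.arange(1, 11, dtype=np.int32).reshape(5, 2)   # 4-byte ints

wf = rolling_window_strided(xf, 3)
wi = rolling_window_strided(xi, 3)
print(wf.shape, wi.shape)   # (3, 3, 2) (3, 3, 2)
print(wi[2])
```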

Your shape could be wrong. If it is too large, the view starts reading values past the end of the data buffer.

   [[  7.00000000e+000,   8.00000000e+000],
    [  9.00000000e+000,   1.00000000e+001],
    [  8.19968827e-257,   5.30498948e-313]]])

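Here only the display format has changed; the 7, 8, 9, 10 are still there. Writing to those out-of-bounds slots could be dangerous and could corrupt other parts of your code. as_strided is best used for read-only purposes. Writes/sets are trickier.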
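
As a sketch of the read-only advice: as_strided accepts writeable=False, and newer NumPy releases (1.20 and later, which is an assumption about your installed version) provide sliding_window_view, which works out the shape and strides itself and returns a read-only view by default. It appends the window axis last, so the axes are swapped here to match the layout in the question:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided, sliding_window_view

x = np.arange(1, 11.).reshape(5, 2)

# Read-only strided view: accidental writes raise instead of touching memory.
view = as_strided(x, shape=(3, 3, 2),
                  strides=(x.strides[0],) + x.strides,
                  writeable=False)
try:
    view[0, 0, 0] = 99.0
except ValueError as exc:
    print("write rejected:", exc)

# sliding_window_view checks the window size against the array for you.
# The result has shape (3, 2, 3), so swap the last two axes.
safe = sliding_window_view(x, 3, axis=0).swapaxes(1, 2)
print(safe.shape)                   # (3, 3, 2)
print(np.array_equal(safe, view))   # True
```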

Answer 3

Your task is similar to this one, so I adapted it slightly.

# Rolling window for 2D arrays in NumPy
import numpy as np

def rolling_window(a, shape):  # rolling window for 2D array
    s = (a.shape[0] - shape[0] + 1,) + (a.shape[1] - shape[1] + 1,) + shape
    strides = a.strides + a.strides
    return np.lib.stride_tricks.as_strided(a, shape=s, strides=strides)

x = np.array([[1,2],[3,4],[5,6],[7,8],[9,10],[3,4],[5,6],[7,8],[11,12]])
y = np.array([[3,4],[5,6],[7,8]])
found = np.all(np.all(rolling_window(x, y.shape) == y, axis=2), axis=2)
print(found.nonzero()[0])
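
The snippet above searches for a sub-block rather than producing the layout from the question. As a hedged sketch of how the same rolling_window function could be applied to the original array: with a window of shape (3, 2) the result has a length-1 axis for the column position, which can be indexed away:

```python
import numpy as np

def rolling_window(a, shape):  # same function as in the answer above
    s = (a.shape[0] - shape[0] + 1,) + (a.shape[1] - shape[1] + 1,) + shape
    strides = a.strides + a.strides
    return np.lib.stride_tricks.as_strided(a, shape=s, strides=strides)

x = np.arange(1, 11.).reshape(5, 2)

w = rolling_window(x, (3, 2))
print(w.shape)      # (3, 1, 3, 2)

# Only one column position fits a width-2 window, so drop that axis.
out = w[:, 0]
print(out.shape)    # (3, 3, 2)
print(out[1])
```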

