使用 numpy `as_strided` 函数创建任意维度的补丁、平铺、滚动或滑动 windows

Using numpy `as_strided` function to create patches, tiles, rolling or sliding windows of arbitrary dimension

今天早上花了一些时间寻找一个通用的问题来指出重复问题,以解决关于 as_strided and/or . There seem to be a lot of 如何(安全地)创建补丁、滑动 windows、滚动 windows 的问题用于机器学习、卷积、图像处理 and/or 数值积分的数组的图块或视图。

我正在寻找可以接受 windowstepaxis 参数以及 return 一个 as_strided 视图的通用函数任意维度。我将在下面给出我的答案,但我很想知道是否有人可以制定更有效的方法,因为我不确定使用 np.squeeze() 是最好的方法,我不确定我的 assert 语句使函数足够安全以写入结果视图,我不确定如何处理 axis 未按升序排列的边缘情况。

尽职调查

我能找到的最通用的函数是 sklearn.feature_extraction.image.extract_patches written by @eickenberg (as well as the apparently equivalent skimage.util.view_as_windows), but those are not well documented on the net, and can't do windows over fewer axes than there are in the original array (for example, 要求一个特定大小的 window 仅在一个轴上)。通常问题只需要 numpy 的答案。

@Divakar 为一维输入 , but higher-dimension inputs require a bit more care. I've made a bare bones 创建了一个广义 numpy 函数,但它的可扩展性不是很好。

2020 年 1 月编辑:将可迭代对象 return 从列表更改为生成器以节省内存。

编辑 2020 年 10 月:将生成器放在一个单独的函数中,因为混合生成器和 return 语句不能直观地工作。

这是我目前的食谱:

def window_nd(a, window, steps = None, axis = None, gen_data = False):
        """
        Create a windowed view over `n`-dimensional input that uses an 
        `m`-dimensional window, with `m <= n`
        
        Parameters
        -------------
        a : Array-like
            The array to create the view on
            
        window : tuple or int
            If int, the size of the window in `axis`, or in all dimensions if 
            `axis == None`
            
            If tuple, the shape of the desired window.  `window.size` must be:
                equal to `len(axis)` if `axis != None`, else 
                equal to `len(a.shape)`, or 
                1
                
        steps : tuple, int or None
            The offset between consecutive windows in desired dimension
            If None, offset is one in all dimensions
            If int, the offset for all windows over `axis`
            If tuple, the steps along each `axis`.  
                `len(steps)` must me equal to `len(axis)`
    
        axis : tuple, int or None
            The axes over which to apply the window
            If None, apply over all dimensions
            if tuple or int, the dimensions over which to apply the window

        gen_data : boolean
            returns data needed for a generator
    
        Returns
        -------
        
        a_view : ndarray
            A windowed view on the input array `a`, or `a, wshp`, where `whsp` is the window shape needed for creating the generator
            
        """
        ashp = np.array(a.shape)
        
        if axis != None:
            axs = np.array(axis, ndmin = 1)
            assert np.all(np.in1d(axs, np.arange(ashp.size))), "Axes out of range"
        else:
            axs = np.arange(ashp.size)
            
        window = np.array(window, ndmin = 1)
        assert (window.size == axs.size) | (window.size == 1), "Window dims and axes don't match"
        wshp = ashp.copy()
        wshp[axs] = window
        assert np.all(wshp <= ashp), "Window is bigger than input array in axes"
        
        stp = np.ones_like(ashp)
        if steps:
            steps = np.array(steps, ndmin = 1)
            assert np.all(steps > 0), "Only positive steps allowed"
            assert (steps.size == axs.size) | (steps.size == 1), "Steps and axes don't match"
            stp[axs] = steps
    
        astr = np.array(a.strides)
        
        shape = tuple((ashp - wshp) // stp + 1) + tuple(wshp)
        strides = tuple(astr * stp) + tuple(astr)
        
        as_strided = np.lib.stride_tricks.as_strided
        a_view = np.squeeze(as_strided(a, 
                                     shape = shape, 
                                     strides = strides))
        if gen_data :
            return a_view, shape[:-wshp.size]
        else:
            return a_view

def window_gen(a, window, **kwargs):
    #Same docstring as above, returns a generator
    _ = kwargs.pop(gen_data, False)
    a_view, shp = window_nd(a, window, gen_data  = True, **kwargs)
    for idx in np.ndindex(shp):
        yield a_view[idx]

部分测试用例:

a = np.arange(1000).reshape(10,10,10)

window_nd(a, 4).shape # sliding (4x4x4) window
Out: (7, 7, 7, 4, 4, 4)

window_nd(a, 2, 2).shape # (2x2x2) blocks
Out: (5, 5, 5, 2, 2, 2)

window_nd(a, 2, 1, 0).shape # sliding window of width 2 over axis 0
Out: (9, 2, 10, 10)

window_nd(a, 2, 2, (0,1)).shape # tiled (2x2) windows over first and second axes
Out: (5, 5, 2, 2, 10)

window_nd(a,(4,3,2)).shape  # arbitrary sliding window
Out: (7, 8, 9, 4, 3, 2)

window_nd(a,(4,3,2),(1,5,2),(0,2,1)).shape #arbitrary windows, steps and axis
Out: (7, 5, 2, 4, 2, 3) # note shape[-3:] != window as axes are out of order