How is the stride size inferred when we reshape a NumPy array?

Question

I have a 1x1024 one-dimensional array (a flattened image). To view the image, I want to reshape it to 32x32.

I can do that easily with x.reshape(-1,32), and it works as I expect: it does not scramble the image. It reads the 1-D array one row of width 32 at a time.

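Now say there are 4 images, each of size 32 x 8. What is a safe way to reshape the array? What is the logic behind how the strides are defined? Does it always start from the highest dimension (say, 3d->2d->1d)? It seems so..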

In [2]: a = np.arange(1024)


In [3]: a.reshape(4,32,8)
Out[3]:
array([[[   0,    1,    2, ...,    5,    6,    7],
        [   8,    9,   10, ...,   13,   14,   15],
        [  16,   17,   18, ...,   21,   22,   23],
        ...,
        [ 232,  233,  234, ...,  237,  238,  239],
        [ 240,  241,  242, ...,  245,  246,  247],
        [ 248,  249,  250, ...,  253,  254,  255]],

       [[ 256,  257,  258, ...,  261,  262,  263],
        [ 264,  265,  266, ...,  269,  270,  271],
        [ 272,  273,  274, ...,  277,  278,  279],
        ...,
        [ 488,  489,  490, ...,  493,  494,  495],
        [ 496,  497,  498, ...,  501,  502,  503],
        [ 504,  505,  506, ...,  509,  510,  511]],

       [[ 512,  513,  514, ...,  517,  518,  519],
        [ 520,  521,  522, ...,  525,  526,  527],
        [ 528,  529,  530, ...,  533,  534,  535],
        ...,
        [ 744,  745,  746, ...,  749,  750,  751],
        [ 752,  753,  754, ...,  757,  758,  759],
        [ 760,  761,  762, ...,  765,  766,  767]],

       [[ 768,  769,  770, ...,  773,  774,  775],
        [ 776,  777,  778, ...,  781,  782,  783],
        [ 784,  785,  786, ...,  789,  790,  791],
        ...,
        [1000, 1001, 1002, ..., 1005, 1006, 1007],
        [1008, 1009, 1010, ..., 1013, 1014, 1015],
        [1016, 1017, 1018, ..., 1021, 1022, 1023]]])

In [4]: a.reshape(4,-1,8)
Out[4]:
array([[[   0,    1,    2, ...,    5,    6,    7],
        [   8,    9,   10, ...,   13,   14,   15],
        [  16,   17,   18, ...,   21,   22,   23],
        ...,
        [ 232,  233,  234, ...,  237,  238,  239],
        [ 240,  241,  242, ...,  245,  246,  247],
        [ 248,  249,  250, ...,  253,  254,  255]],

       [[ 256,  257,  258, ...,  261,  262,  263],
        [ 264,  265,  266, ...,  269,  270,  271],
        [ 272,  273,  274, ...,  277,  278,  279],
        ...,
        [ 488,  489,  490, ...,  493,  494,  495],
        [ 496,  497,  498, ...,  501,  502,  503],
        [ 504,  505,  506, ...,  509,  510,  511]],

       [[ 512,  513,  514, ...,  517,  518,  519],
        [ 520,  521,  522, ...,  525,  526,  527],
        [ 528,  529,  530, ...,  533,  534,  535],
        ...,
        [ 744,  745,  746, ...,  749,  750,  751],
        [ 752,  753,  754, ...,  757,  758,  759],
        [ 760,  761,  762, ...,  765,  766,  767]],

       [[ 768,  769,  770, ...,  773,  774,  775],
        [ 776,  777,  778, ...,  781,  782,  783],
        [ 784,  785,  786, ...,  789,  790,  791],
        ...,
        [1000, 1001, 1002, ..., 1005, 1006, 1007],
        [1008, 1009, 1010, ..., 1013, 1014, 1015],
        [1016, 1017, 1018, ..., 1021, 1022, 1023]]])

In [5]: a.reshape(4,8,32)
Out[5]:
array([[[   0,    1,    2, ...,   29,   30,   31],
        [  32,   33,   34, ...,   61,   62,   63],
        [  64,   65,   66, ...,   93,   94,   95],
        ...,
        [ 160,  161,  162, ...,  189,  190,  191],
        [ 192,  193,  194, ...,  221,  222,  223],
        [ 224,  225,  226, ...,  253,  254,  255]],

       [[ 256,  257,  258, ...,  285,  286,  287],
        [ 288,  289,  290, ...,  317,  318,  319],
        [ 320,  321,  322, ...,  349,  350,  351],
        ...,
        [ 416,  417,  418, ...,  445,  446,  447],
        [ 448,  449,  450, ...,  477,  478,  479],
        [ 480,  481,  482, ...,  509,  510,  511]],

       [[ 512,  513,  514, ...,  541,  542,  543],
        [ 544,  545,  546, ...,  573,  574,  575],
        [ 576,  577,  578, ...,  605,  606,  607],
        ...,
        [ 672,  673,  674, ...,  701,  702,  703],
        [ 704,  705,  706, ...,  733,  734,  735],
        [ 736,  737,  738, ...,  765,  766,  767]],

       [[ 768,  769,  770, ...,  797,  798,  799],
        [ 800,  801,  802, ...,  829,  830,  831],
        [ 832,  833,  834, ...,  861,  862,  863],
        ...,
        [ 928,  929,  930, ...,  957,  958,  959],
        [ 960,  961,  962, ...,  989,  990,  991],
        [ 992,  993,  994, ..., 1021, 1022, 1023]]])

Answer 1

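reshape does not reorder the underlying values. The array is stored as a 1-D byte buffer, plus a shape, strides, and dtype that are used to view that buffer as a particular multidimensional array.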
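A minimal sketch of how to check this for the 4-image case from the question (the variable names here are just for illustration): the reshaped array shares the original buffer, and each 32x8 image is simply the next run of 256 consecutive values.

```python
import numpy as np

a = np.arange(1024)
imgs = a.reshape(4, 32, 8)   # 4 images, 32 rows, 8 columns

# Reshaping a contiguous array gives a view on the same buffer, not a copy.
shares = np.shares_memory(a, imgs)

# Image k is the k-th run of 32*8 = 256 consecutive values.
same_block = np.array_equal(imgs[1], a[256:512].reshape(32, 8))

# Flattening gives back the original order, so nothing was shuffled.
round_trip = np.array_equal(imgs.ravel(), a)

print(shares, same_block, round_trip)
```

So reshape(4,32,8) is safe as long as the flat data really is laid out image after image, row after row.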

Take a look at the strides attribute:

In [513]: arr = np.arange(1024)
In [514]: arr.shape, arr.strides
Out[514]: ((1024,), (8,))
In [515]: arr1=arr.reshape(32,32);arr1.shape, arr1.strides
Out[515]: ((32, 32), (256, 8))
In [516]: arr1=arr.reshape(4,32,8);arr1.shape, arr1.strides
Out[516]: ((4, 32, 8), (2048, 64, 8))

For 1d, it just steps 8 bytes at a time (the size of an int64, the dtype in this session).

With 2d, 256 = 32*8; to move from one row to the next it has to step 256 bytes.

With 3d, 2048 = 32 * 8 * 8 (32 rows of 8 elements, 8 bytes each); that is the step between blocks.
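The pattern in those numbers can be written down as a short sketch. c_order_strides is a hypothetical helper, not a NumPy function, and it only covers the default C (row-major) layout of a contiguous array: the last axis steps by one element, and each axis further out steps over everything inside it.

```python
import numpy as np

def c_order_strides(shape, itemsize):
    """Byte strides of a C-contiguous array with this shape (illustrative helper)."""
    strides = []
    step = itemsize               # the last axis moves one element at a time
    for dim in reversed(shape):
        strides.append(step)
        step *= dim               # the next axis out has to skip a whole sub-block
    return tuple(reversed(strides))

# dtype is fixed so that itemsize is 8 regardless of platform defaults
arr = np.arange(1024, dtype=np.int64)
for shape in [(1024,), (32, 32), (4, 32, 8)]:
    view = arr.reshape(shape)
    print(shape, c_order_strides(shape, arr.itemsize), view.strides)
```

For each shape the computed tuple should match what NumPy reports in .strides.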

For fun, look at a transpose:

In [517]: arr1=arr.reshape(4,32,8).T;arr1.shape, arr1.strides
Out[517]: ((8, 32, 4), (8, 64, 2048))

The shape has been reversed, and so have the strides.
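To see why reversing the strides is enough, here is a small sketch (byte_offset is a made-up helper for illustration): an element's position in the buffer is the sum of index times stride over all axes, so the transposed view reaches the same byte with the indices swapped and no data moved.

```python
import numpy as np

arr = np.arange(1024, dtype=np.int64)   # dtype fixed so itemsize is 8
a3 = arr.reshape(4, 32, 8)
t = a3.T                                # same buffer, shape and strides reversed

def byte_offset(index, strides):
    # Position of an element in the buffer: sum of index * stride per axis.
    return sum(i * s for i, s in zip(index, strides))

# a3[i, j, k] and t[k, j, i] resolve to the same byte in the buffer.
off_a = byte_offset((2, 5, 3), a3.strides)
off_t = byte_offset((3, 5, 2), t.strides)

print(off_a, off_t)
print(off_a // arr.itemsize, a3[2, 5, 3], t[3, 5, 2])
print(np.shares_memory(arr, t))
```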

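Often, when reshaping an image array into blocks, we need to reshape into small pieces, do a partial transpose, and then reshape to the target. The first reshape and the transpose create views, just playing with the shape and strides. But the final reshape often requires a copy.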
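As a rough illustration of that reshape / transpose / reshape pattern, here is a sketch on a tiny 4x6 stand-in image split into 2x3 blocks (the sizes and names are my own choice, not from the question). np.shares_memory shows which steps are views and which one copies.

```python
import numpy as np

img = np.arange(24).reshape(4, 6)     # small stand-in for an image

# axes: (block_row, row_in_block, block_col, col_in_block)
step1 = img.reshape(2, 2, 2, 3)
# axes: (block_row, block_col, row_in_block, col_in_block)
step2 = step1.transpose(0, 2, 1, 3)
# merge the two block axes: 4 blocks of shape 2x3
blocks = step2.reshape(4, 2, 3)

print(np.shares_memory(img, step1))   # first reshape: still a view
print(np.shares_memory(img, step2))   # transpose: still a view
print(np.shares_memory(img, blocks))  # final reshape: had to copy here
print(blocks[1])                      # the top-right block, img[0:2, 3:6]
```

The copy happens because, after the transpose, the two block axes can no longer be described by a single stride, so NumPy has to lay the data out afresh.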

python

numpy

reshape

pytorch

tensor