Numpy - slicing 2d row or column vector from array
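I'm trying to find a neat little trick for slicing a row/column from a 2d array and obtaining an array of shape (col_size x 1) or (1 x row_size).
Is there an easier way than calling numpy.reshape() after every slice?
Cheers,
Stephen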
You can slice and insert a new axis in a single operation. For example, here is a 2D array:
>>> a = np.arange(1, 7).reshape(2, 3)
>>> a
array([[1, 2, 3],
       [4, 5, 6]])
To slice out a single column (returning an array of shape (2, 1)), index with None in the third position:
>>> a[:, 1, None]
array([[2],
       [5]])
To slice out a single row (returning an array of shape (1, 3)), index with None in the second position:
>>> a[0, None, :]
array([[1, 2, 3]])
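As a small sketch of the same idea: np.newaxis is simply an alias for None, so the two spellings below are interchangeable. The variable names col and row are only illustrative.

```python
import numpy as np

a = np.arange(1, 7).reshape(2, 3)

# np.newaxis is the same object as None; it just reads more explicitly.
col = a[:, 1, np.newaxis]   # column 1 as a (2, 1) array
row = a[0, np.newaxis, :]   # row 0 as a (1, 3) array

print(col.shape)  # (2, 1)
print(row.shape)  # (1, 3)
```

Both results use basic indexing only, so they are views onto a rather than copies.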
Make the index a slice, list or array:
X[[0],:]
X[0:1,:]
But there is nothing wrong with reshape other than the extra typing. It isn't slow. [None,:] is a nice shorthand for it.
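A minimal sketch of that equivalence, using a small made-up array x (not the one timed in this answer): reshape with -1 and indexing with None give the same 2D result.

```python
import numpy as np

x = np.arange(12).reshape(3, 4)

# Row 1 as a (1, 4) array, two ways. -1 lets NumPy infer the length.
row_a = x[1, :].reshape(1, -1)
row_b = x[1, :][None, :]

# Column 2 as a (3, 1) array, two ways.
col_a = x[:, 2].reshape(-1, 1)
col_b = x[:, 2][:, None]

print(row_a.shape, col_a.shape)  # (1, 4) (3, 1)
```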
Using a list index may be the shortest to type, but it does produce a copy (a plus or a minus?) and it is slower.
For a (100,100) integer array:
In [487]: timeit x[[50],:]
100000 loops, best of 3: 10.3 µs per loop # slowest
In [488]: timeit x[50:51,:]
100000 loops, best of 3: 2.24 µs per loop # slice indexing is fast
In [489]: timeit x[50,:].reshape(1,-1)
100000 loops, best of 3: 3.29 µs per loop # minimal time penalty
In [490]: timeit x[50,:][None,:]
100000 loops, best of 3: 3.55 µs per loop
In [543]: timeit x[None,50,:] # **best**
1000000 loops, best of 3: 1.76 µs per loop
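The timings above come from IPython's timeit magic. A rough way to repeat the comparison in a plain script is sketched below with the standard timeit module. The array contents, the loop count and the candidates dictionary are assumptions made for illustration, and absolute numbers will differ by machine and NumPy version.

```python
import timeit
import numpy as np

x = np.arange(100 * 100).reshape(100, 100)

# Each candidate returns row 50 as a (1, 100) array.
candidates = {
    "x[[50], :]": lambda: x[[50], :],
    "x[50:51, :]": lambda: x[50:51, :],
    "x[50, :].reshape(1, -1)": lambda: x[50, :].reshape(1, -1),
    "x[50, :][None, :]": lambda: x[50, :][None, :],
    "x[None, 50, :]": lambda: x[None, 50, :],
}

number = 2000  # calls per timing run (arbitrary choice)
timings = {}
for label, fn in candidates.items():
    # Best of 3 runs, converted to microseconds per call.
    best = min(timeit.repeat(fn, number=number, repeat=3))
    timings[label] = best / number * 1e6

for label, us in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{label:26s} {us:8.3f} us per call")
```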
One test for a copy is to compare the data buffer pointer with that of the original array.
In [492]: x.__array_interface__['data']
Out[492]: (175920456, False)
In [493]: x[50,:].__array_interface__['data']
Out[493]: (175940456, False)
In [494]: x[[50],:].__array_interface__['data']
Out[494]: (175871672, False) # different pointer
In [495]: x[50:51,:].__array_interface__['data']
Out[495]: (175940456, False)
In [496]: x[50,:][None,:].__array_interface__['data']
Out[496]: (175940456, False)
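Note that a view of row 50 starts at a different address than x itself, so comparing raw pointers means accounting for the offset. As an alternative sketch (not part of the original answer), np.shares_memory answers the same question directly, and a write test confirms it. The array contents here are made up for illustration.

```python
import numpy as np

x = np.arange(100 * 100).reshape(100, 100)

view_slice = x[50:51, :]    # basic indexing
view_none = x[None, 50, :]  # basic indexing plus a new axis
copy_list = x[[50], :]      # list (advanced) indexing

print(np.shares_memory(x, view_slice))  # True
print(np.shares_memory(x, view_none))   # True
print(np.shares_memory(x, copy_list))   # False

# Writing through a view changes x; writing to the copy does not.
view_slice[0, 0] = -1
copy_list[0, 1] = -2
print(x[50, 0], x[50, 1])  # -1 5001
```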
How about this nice and easy way?
In [73]: arr = (np.arange(5, 25)).reshape(5, 4)
In [74]: arr
Out[74]:
array([[ 5,  6,  7,  8],
       [ 9, 10, 11, 12],
       [13, 14, 15, 16],
       [17, 18, 19, 20],
       [21, 22, 23, 24]])
# extract column 1 as a column vector
In [79]: col1 = arr[:, [0]]
In [80]: col1.shape
Out[80]: (5, 1)
In [81]: col1
Out[81]:
array([[ 5],
       [ 9],
       [13],
       [17],
       [21]])
# extract row 1 as a row vector
In [82]: row1 = arr[[0], :]
In [83]: row1.shape
Out[83]: (1, 4)
In [84]: row1
Out[84]: array([[5, 6, 7, 8]])
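One thing the list form offers, sketched here on the same arr, is that it extends naturally to picking several rows or columns while staying 2D. The particular indices chosen below are arbitrary. As with any list index, the result is a copy.

```python
import numpy as np

arr = np.arange(5, 25).reshape(5, 4)

col = arr[:, [0]]      # one column, shape (5, 1)
cols = arr[:, [0, 2]]  # two columns, shape (5, 2)
rows = arr[[0, 3], :]  # two rows, shape (2, 4)

print(col.shape, cols.shape, rows.shape)  # (5, 1) (5, 2) (2, 4)

# List indexing returns a copy, so this write does not reach arr.
col[0, 0] = -1
print(arr[0, 0])  # 5
```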