Extract an ndarray from a np.void array

Question

1. The npy file I used: https://github.com/mangomangomango0820/DataAnalysis/blob/master/NumPy/NumPyEx/NumPy_Ex1_3Dscatterplt.npy

2. After loading the npy file:

import numpy as np

data = np.load('NumPy_Ex1_3Dscatterplt.npy')
'''
[([   2,    2, 1920,  480],) ([   1,    3, 1923,  480],)
 ......
 ([   3,    3, 1923,  480],)]
 
 
⬆️ data.shape, (69,)
⬆️ data.dtype, [('f0', '<i8', (4,))]
⬆️ type(data), <class 'numpy.ndarray'>
⬆️ type(data[0]), <class 'numpy.void'>
'''

As you can see, each row, e.g. data[0], is of type <class 'numpy.void'>.

I would like to get an ndarray from the data above that looks like this ⬇️

[[   2    2 1920  480]
...
 [   3    3 1923  480]]

My solution is ⬇️

# named all_rows rather than all, to avoid shadowing the built-in all()
all_rows = np.array([data[i][0] for i in range(data.shape[0])])

'''
[[   2    2 1920  480]
...
 [   3    3 1923  480]]
'''

I am wondering whether there is a smarter way to process numpy.void data and get the expected result.

Answer 1

Here is the trick:

data_clean = np.array(data.tolist())
print(data_clean)
print(data_clean.shape)

Output:

[[[   2    2 1920  480]]

...............

 [[   3    3 1923  480]]]
(69, 1, 4)

If you don't like the extra length-1 dimension in the middle, you can squeeze it out like this:

data_sqz = data_clean.squeeze()
print(data_sqz)
print(data_sqz.shape)

Output:

[[   2    2 1920  480]
...
 [   3    3 1923  480]]
(69, 4)
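
One caveat with a bare squeeze(): it drops every axis of length 1, so an array holding a single record would collapse to shape (4,) instead of (1, 4). A minimal sketch, using a small hand-built stand-in for data (the stand-in array and the variable names are assumptions, not the real file), that names the axis explicitly:

```python
import numpy as np

# Hypothetical stand-in for the loaded file: one field "f0" holding 4 int64 values.
dt = np.dtype([("f0", "<i8", (4,))])
single = np.array([([2, 2, 1920, 480],)], dtype=dt)   # only one record

clean = np.array(single.tolist())      # shape (1, 1, 4)

collapsed = clean.squeeze()            # removes both length-1 axes
kept_rows = clean.squeeze(axis=1)      # removes only the middle axis

print(collapsed.shape, kept_rows.shape)
```

With 69 records the two calls give the same result, so this only matters when the number of rows can be 1.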

Answer 2

Your data is a structured array, with a compound dtype.

https://numpy.org/doc/stable/user/basics.rec.html

I can recreate it:

In [130]: dt = np.dtype([("f0", "<i8", (4,))])
In [131]: x = np.array(
     ...:     [([2, 2, 1920, 480],), ([1, 3, 1923, 480],), ([3, 3, 1923, 480],)], dtype=dt
     ...: )
In [132]: x
Out[132]: 
array([([   2,    2, 1920,  480],), ([   1,    3, 1923,  480],),
       ([   3,    3, 1923,  480],)], dtype=[('f0', '<i8', (4,))])

This is a 1d array with one field, and the field itself contains 4 elements.

Fields are accessed by name:

In [133]: x["f0"]
Out[133]: 
array([[   2,    2, 1920,  480],
       [   1,    3, 1923,  480],
       [   3,    3, 1923,  480]])

This is an integer dtype array with shape (3,4).

Accessing fields by name also works for more complex structured arrays.
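
As a rough illustration of the multi-field case, here is a sketch with a made-up dtype (the field names id, box and score are hypothetical, chosen only for this example). It also shows that single-field access returns a view rather than a copy:

```python
import numpy as np

# Hypothetical dtype for illustration: a scalar id, a 4-element box, a float score.
dt = np.dtype([("id", "<i4"), ("box", "<i8", (4,)), ("score", "<f8")])
recs = np.array(
    [(0, [2, 2, 1920, 480], 0.5), (1, [1, 3, 1923, 480], 0.9)],
    dtype=dt,
)

boxes = recs["box"]       # plain integer array, shape (2, 4)
scores = recs["score"]    # plain float array, shape (2,)

# Field access returns a view, so writing through it changes the structured array.
boxes[0, 0] = 99
print(recs["box"][0])
```

If an independent array is needed, recs["box"].copy() avoids the aliasing.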

Using the tolist approach from the other answer:

In [134]: x.tolist()
Out[134]: 
[(array([   2,    2, 1920,  480]),),
 (array([   1,    3, 1923,  480]),),
 (array([   3,    3, 1923,  480]),)]

In [135]: np.array(x.tolist())           # (3,1,4) shape
Out[135]: 
array([[[   2,    2, 1920,  480]],

       [[   1,    3, 1923,  480]],

       [[   3,    3, 1923,  480]]])
In [136]: np.vstack(x.tolist())          # (3,4) shape
Out[136]: 
array([[   2,    2, 1920,  480],
       [   1,    3, 1923,  480],
       [   3,    3, 1923,  480]])

The documentation also suggests using:

In [137]: import numpy.lib.recfunctions as rf
In [138]: rf.structured_to_unstructured(x)
Out[138]: 
array([[   2,    2, 1920,  480],
       [   1,    3, 1923,  480],
       [   3,    3, 1923,  480]])
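
A short sketch of how structured_to_unstructured behaves on two layouts: the single subarray field from the question, and a second dtype with two scalar fields (the x/y dtype and all variable names here are assumptions made up for the example):

```python
import numpy as np
import numpy.lib.recfunctions as rf

# Single field with a (4,) subarray, as in the question.
dt = np.dtype([("f0", "<i8", (4,))])
x = np.array([([2, 2, 1920, 480],), ([1, 3, 1923, 480],)], dtype=dt)
flat = rf.structured_to_unstructured(x)        # shape (2, 4)

# Hypothetical two-field dtype: the fields become columns of the last axis.
pts = np.array([(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)],
               dtype=[("x", "<f8"), ("y", "<f8")])
xy = rf.structured_to_unstructured(pts)        # shape (3, 2)

print(flat.shape, xy.shape)
```

For the single-field case this gives the same values as x["f0"]; the function is mainly useful when several fields should be combined into one plain array.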

Elements of a structured array are displayed as tuples, though their type is the generic np.void.
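
To make that concrete, a small sketch that inspects one element of a rebuilt two-record array (the array is a stand-in for the question's data, not the real file):

```python
import numpy as np

dt = np.dtype([("f0", "<i8", (4,))])
x = np.array([([2, 2, 1920, 480],), ([1, 3, 1923, 480],)], dtype=dt)

rec = x[0]
print(type(rec))          # numpy.void, even though it prints like a tuple
print(rec.dtype.names)    # the field names carried by the element
print(rec["f0"])          # field by name
print(rec[0])             # field by position, as in the question's data[i][0]
```

This is why the question's data[i][0] works: indexing a np.void element by position selects its first field.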

There is an older class, recarray, that is similar but adds another way of accessing fields, as attributes:

In [146]: y=x.view(np.recarray)
In [147]: y
Out[147]: 
rec.array([([   2,    2, 1920,  480],), ([   1,    3, 1923,  480],),
           ([   3,    3, 1923,  480],)],
          dtype=[('f0', '<i8', (4,))])
In [148]: y.f0
Out[148]: 
array([[   2,    2, 1920,  480],
       [   1,    3, 1923,  480],
       [   3,    3, 1923,  480]])
In [149]: type(y[0])
Out[149]: numpy.record
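
A brief sketch checking how the recarray view relates to the original array (same stand-in array as above; the variable names are illustrative only):

```python
import numpy as np

dt = np.dtype([("f0", "<i8", (4,))])
x = np.array([([2, 2, 1920, 480],), ([1, 3, 1923, 480],)], dtype=dt)

y = x.view(np.recarray)            # same memory, different class

same_data = np.array_equal(y.f0, x["f0"])    # attribute vs. name access
is_void = issubclass(np.record, np.void)     # records are still void scalars
shared = np.shares_memory(x, y)              # the view does not copy

print(same_data, is_void, shared)
```

So the recarray is only a different interface over the same buffer, and x["f0"] still works on it.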

I often refer to elements of structured arrays as records.

python

numpy

numpy-ndarray