Broken structured to unstructured numpy array conversion in 1.16.0
I want to convert a NumPy structured array with homogeneous (np.float) columns to an unstructured array in NumPy 1.16.0.
Previously I did it like this:
array = np.ones((100,), dtype=[('user', np.object), ('item', np.float), ('value', np.float)])
array[['item','value']].view((np.float, 2))
In 1.16.0, the structured_to_unstructured function appeared in numpy.lib.recfunctions.
But for a view of an array that contains an object column, both the new structured_to_unstructured and the old view approach raise a TypeError:
Cannot change data-type for object array.
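For views of structured arrays with no object columns at all it works fine, but it fails when the view holds only numeric columns selected from an array that also contains an object field.
In 1.16 the handling of multi-field views changed significantly. You need to use rf.repack_fields to get the earlier behavior.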
In [277]: import numpy.lib.recfunctions as rf
In [287]: arr = np.ones(3, dtype='O,f,f')
In [288]: arr
Out[288]:
array([(1, 1., 1.), (1, 1., 1.), (1, 1., 1.)],
dtype=[('f0', 'O'), ('f1', '<f4'), ('f2', '<f4')])
In [289]: rf.structured_to_unstructured(arr[['f1','f2']])
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-289-8700aa9aacb4> in <module>
----> 1 rf.structured_to_unstructured(arr[['f1','f2']])
/usr/local/lib/python3.6/dist-packages/numpy/lib/recfunctions.py in structured_to_unstructured(arr, dtype, copy, casting)
969 with suppress_warnings() as sup: # until 1.16 (gh-12447)
970 sup.filter(FutureWarning, "Numpy has detected")
--> 971 arr = arr.view(flattened_fields)
972
973 # next cast to a packed format with all fields converted to new dtype
/usr/local/lib/python3.6/dist-packages/numpy/core/_internal.py in _view_is_safe(oldtype, newtype)
492
493 if newtype.hasobject or oldtype.hasobject:
--> 494 raise TypeError("Cannot change data-type for object array.")
495 return
496
TypeError: Cannot change data-type for object array.
Repack before converting:
In [290]: rf.structured_to_unstructured(rf.repack_fields(arr[['f1','f2']]))
Out[290]:
array([[1., 1.],
[1., 1.],
[1., 1.]], dtype=float32)
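Applied to the array from the question, the same repack-then-convert step might look like the sketch below. It assumes NumPy 1.16 or later, and it swaps the np.object and np.float aliases for the builtin object and np.float64, since those aliases are not available in newer NumPy releases; that substitution is mine, not part of the original question.

```python
import numpy as np
import numpy.lib.recfunctions as rf

# Same layout as in the question; object / np.float64 stand in for the
# np.object / np.float aliases (an assumption for newer NumPy versions).
array = np.ones((100,), dtype=[('user', object),
                               ('item', np.float64),
                               ('value', np.float64)])

# Select the numeric fields, then make a packed copy without the object field.
packed = rf.repack_fields(array[['item', 'value']])

# The packed array has only float fields, so the conversion is allowed.
unstructured = rf.structured_to_unstructured(packed)

print(unstructured.shape, unstructured.dtype)  # (100, 2) float64
```

Note that repack_fields copies the selected fields, so later writes to the result do not reach the original array.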
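A multi-field view preserves the underlying data layout. Note the use of offsets in this display. The object field is still there, it is just not shown.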
In [291]: arr[['f1','f2']]
Out[291]:
array([(1., 1.), (1., 1.), (1., 1.)],
dtype={'names':['f1','f2'], 'formats':['<f4','<f4'], 'offsets':[8,12], 'itemsize':16})
repack_fields makes a copy that does not contain the object field:
In [292]: rf.repack_fields(arr[['f1','f2']])
Out[292]: array([(1., 1.), (1., 1.), (1., 1.)], dtype=[('f1', '<f4'), ('f2', '<f4')])
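If it helps to see the difference without reading the dtype repr, the layout can be checked programmatically. This is only a small sketch, assuming NumPy 1.16 or later; the variable names view and packed are mine.

```python
import numpy as np
import numpy.lib.recfunctions as rf

arr = np.ones(3, dtype='O,f,f')
view = arr[['f1', 'f2']]          # multi-field view of the original buffer
packed = rf.repack_fields(view)   # packed copy with only f1 and f2

# The view keeps the parent's itemsize and the original field offsets.
print(view.dtype.itemsize == arr.dtype.itemsize)
print(view.dtype.fields['f1'][1] == arr.dtype.fields['f1'][1])

# The packed copy holds two float32 fields back to back (8 bytes per record).
print(packed.dtype.itemsize)

# The view may overlap the original memory; the packed copy does not.
print(np.may_share_memory(view, arr), np.may_share_memory(packed, arr))
```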
Even when all fields are floats, the view method has problems:
In [301]: arr = np.ones(3, dtype='f,f,f')
In [302]: arr[['f1','f2']].view(('f',2))
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-302-68433a44bcfe> in <module>
----> 1 arr[['f1','f2']].view(('f',2))
ValueError: Changing the dtype to a subarray type is only supported if the total itemsize is unchanged
In [303]: arr[['f1','f2']]
Out[303]:
array([(1., 1.), (1., 1.), (1., 1.)],
dtype={'names':['f1','f2'], 'formats':['<f4','<f4'], 'offsets':[4,8], 'itemsize':12})
In [304]: rf.repack_fields(arr[['f1','f2']]).view(('f',2))
Out[304]:
array([[1., 1.],
[1., 1.],
[1., 1.]], dtype=float32)
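For the all-float case, a quick check that the repacked view and structured_to_unstructured agree could look like this. It is a sketch assuming NumPy 1.16 or later; the sample values are made up so the columns can be told apart.

```python
import numpy as np
import numpy.lib.recfunctions as rf

# Illustrative data: three float32 fields with distinguishable values.
arr = np.zeros(3, dtype='f,f,f')
arr['f0'] = [0, 1, 2]
arr['f1'] = [10, 11, 12]
arr['f2'] = [20, 21, 22]

# Old-style approach: repack first so the itemsize matches the subarray dtype.
via_view = rf.repack_fields(arr[['f1', 'f2']]).view(('f', 2))

# recfunctions approach: no object field here, so no repack is needed.
via_rf = rf.structured_to_unstructured(arr[['f1', 'f2']])

print(via_view.shape, via_rf.shape)
print(np.array_equal(via_view, via_rf))
```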