Python HDF5 Attributes
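I'm trying to save measurement attributes in an HDF5 file. I've spent a lot of time working with files produced in a format where a single attribute entry appears to hold a set of attributes with different data types.
For example, for my file, the commands
import h5py
import numpy as np
f = h5py.File('test.data','r+')
f['Measurement/Surface'].attrs['X Converter']
produce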
array([(b'LateralCat', b'Pixels', array([0. , 2.00097752, 0. , 0. ]))],
dtype=[('Category', 'O'), ('BaseUnit', 'O'), ('Parameters', 'O')])
Here, the first two items are strings and the third is an array. Now, if I try to save the value to a different file:
f1 = h5py.File('test_output.data','r+')
f1['Measurement/Surface'].attrs.create('X Converter',[(b'LateralCat', b'Pixels', np.array([0. , 2.00097752, 0. , 0. ]))])
I get this error:
Traceback (most recent call last):
  File "<pyshell#94>", line 1, in <module>
    f1['Measurement/Surface'].attrs.create('X Converter',[(b'LateralCat', b'Pixels', np.array([0. , 2.00097752, 0. , 0. ]))])
  File "C:\WinPython\WinPython-64bit-3.6.3.0Zero\python-3.6.3.amd64\lib\site-packages\h5py\_hl\attrs.py", line 171, in create
    htype = h5t.py_create(original_dtype, logical=True)
  File "h5py\h5t.pyx", line 1611, in h5py.h5t.py_create
  File "h5py\h5t.pyx", line 1633, in h5py.h5t.py_create
  File "h5py\h5t.pyx", line 1688, in h5py.h5t.py_create
TypeError: Object dtype dtype('O') has no native HDF5 equivalent
What am I missing?
You aren't saving the same thing. The dtype of the original is significant:
In [101]: [(b'LateralCat', b'Pixels', np.array([0. , 2.00097752, 0. ,
...: 0. ]))]
Out[101]:
[(b'LateralCat',
b'Pixels',
array([0. , 2.00097752, 0. , 0. ]))]
In [102]: np.array(_)
<ipython-input-102-7a2cd91c32ca>:1: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
np.array(_)
Out[102]:
array([[b'LateralCat', b'Pixels',
array([0. , 2.00097752, 0. , 0. ])]],
dtype=object)
In [104]: np.array([(b'LateralCat', b'Pixels', np.array([0. , 2.00097752, 0.
...: , 0. ]))],
...: dtype=[('Category', 'O'), ('BaseUnit', 'O'), ('Parameters', 'O')])
Out[104]:
array([(b'LateralCat', b'Pixels', array([0. , 2.00097752, 0. , 0. ]))],
dtype=[('Category', 'O'), ('BaseUnit', 'O'), ('Parameters', 'O')])
In [105]: x = _
In [106]: x.dtype
Out[106]: dtype([('Category', 'O'), ('BaseUnit', 'O'), ('Parameters', 'O')])
In [108]: x['Category']
Out[108]: array([b'LateralCat'], dtype=object)
In [109]: x['BaseUnit']
Out[109]: array([b'Pixels'], dtype=object)
In [110]: x['Parameters']
Out[110]:
array([array([0. , 2.00097752, 0. , 0. ])],
dtype=object)
Building the array with the same dtype doesn't solve the problem entirely, though, because that dtype still contains object dtype fields:
In [111]: import h5py
In [112]: f=h5py.File('test.h5','w')
In [113]: g = f.create_group('test')
In [114]: g.attrs.create('converter',x)
Traceback (most recent call last):
...
TypeError: Object dtype dtype('O') has no native HDF5 equivalent
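As noted in the comments, numpy object dtypes are problematic when writing with h5py. Do you know how the original file was created? There may be some format or structure in it that h5py renders as a compound dtype with object fields, but that isn't directly writable. I'd have to dig into the docs (and maybe the original file) to learn more.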
https://docs.h5py.org/en/stable/special.html
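One likely reason the read-back dtype and the hand-built one behave differently: h5py marks variable-length strings and arrays as object dtypes carrying hidden metadata, and the dtype repr doesn't show that metadata. The sketch below only inspects dtypes. It assumes h5py 2.10 or later (for `string_dtype` and `vlen_dtype`), and the ASCII encoding and float64 base type are guesses about the original file, not something confirmed by it. Whether a compound dtype with such fields can actually be written as an attribute is not tested here.

```python
import h5py
import numpy as np

# A plain object dtype, as produced by dtype=[('Category', 'O'), ...]
plain = np.dtype("O")

# h5py's tagged object dtypes (encoding and base type are assumptions)
vlen_str = h5py.string_dtype(encoding="ascii")
vlen_f8 = h5py.vlen_dtype(np.float64)

# A compound dtype whose fields keep the h5py tags
tagged = np.dtype([("Category", vlen_str),
                   ("BaseUnit", vlen_str),
                   ("Parameters", vlen_f8)])

# The helpers return None for an untagged object dtype
print(h5py.check_string_dtype(plain))
print(h5py.check_vlen_dtype(plain))

# ...and describe the HDF5 type for a tagged one
print(h5py.check_string_dtype(tagged["Category"]))
print(h5py.check_vlen_dtype(tagged["Parameters"]))
```

Running the same two `check_*` calls on the fields of `f['Measurement/Surface'].attrs['X Converter'].dtype` should show whether the original attribute uses these variable-length types.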
I can write that data as a more conventional structured array:
In [120]: y=np.array([(b'LateralCat', b'Pixels', np.array([0. , 2.00097752,
...: 0. , 0. ]))],
...: dtype=[('Category', 'S20'), ('BaseUnit', 'S20'), ('Parameters', 'float',4)])
In [121]: y
Out[121]:
array([(b'LateralCat', b'Pixels', [0. , 2.00097752, 0. , 0. ])],
dtype=[('Category', 'S20'), ('BaseUnit', 'S20'), ('Parameters', '<f8', (4,))])
In [122]: g.attrs.create('converter',y)
In [125]: g.attrs['converter']
Out[125]:
array([(b'LateralCat', b'Pixels', [0. , 2.00097752, 0. , 0. ])],
dtype=[('Category', 'S20'), ('BaseUnit', 'S20'), ('Parameters', '<f8', (4,))])
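If there are many such attributes, the fixed-size dtype can be derived from the values instead of being typed by hand. This is a minimal sketch, not part of h5py or numpy: `to_fixed_dtype` is a hypothetical helper, and it assumes a 1-D structured array in which each object field holds either `bytes` values or equally shaped numeric arrays.

```python
import numpy as np

def to_fixed_dtype(rec):
    """Return a copy of a 1-D structured array with object fields
    replaced by fixed-size string or subarray fields (hypothetical helper)."""
    fields = []
    columns = {}
    for name in rec.dtype.names:
        values = list(rec[name])
        if isinstance(values[0], bytes):
            # Fixed-width byte string, wide enough for the longest value
            width = max(1, max(len(v) for v in values))
            columns[name] = np.array(values, dtype=f"S{width}")
            fields.append((name, f"S{width}"))
        else:
            # Stack per-record arrays; this fails if their shapes differ
            stacked = np.stack([np.asarray(v) for v in values])
            columns[name] = stacked
            fields.append((name, stacked.dtype, stacked.shape[1:]))
    out = np.empty(len(rec), dtype=fields)
    for name, col in columns.items():
        out[name] = col
    return out

x = np.array([(b'LateralCat', b'Pixels', np.array([0., 2.00097752, 0., 0.]))],
             dtype=[('Category', 'O'), ('BaseUnit', 'O'), ('Parameters', 'O')])
y = to_fixed_dtype(x)
print(y.dtype)
```

The result contains no object fields, so it can be passed to `g.attrs.create('converter', y)` in the same way as the hand-written array above. Note that the stored layout differs from the original file, which may matter if other software expects the original variable-length types.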