Writing a 2D NumPy Array to an HDF5 Dataset
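I want to use Python (h5py) to write a 2D NumPy array to an HDF5 file, but I can't get it to work. This is what the dataset should look like:
[Image: The Properties]
[Image: The Data]
Here is the code: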
elements = {
    'Ti': ['47Ti', '49Ti'],
    'Cr': ['52Cr', '53Cr'],
    'Fe': ['54Fe', '57Fe'],
    'Mn': ['55Mn']}

# arg3: signalData
element_data = hdf5processor.process_signal_data(argv[3], elements)
#hdf5processor.plot_elements(element_data)

# arg4: outputFile
hdf5processor.save_dataset(argv[4], elements, element_data)

def save_dataset(filename, elements_list, element_data):
    hf = h5py.File(filename, 'a')

    elements_list_ascii = [n.encode("ascii", "ignore") for n in list(elements_list.keys())]
    elements_list_dataset = hf.create_dataset(
        "spWork/ElementList", (len(elements_list_ascii), 1),
        data=elements_list_ascii, dtype=h5py.string_dtype())

    iostopes_used = np.array([['Element', 'Isotope(s)', 'Null', 'Null', 'Null'],
                              ['Ti', '47Ti', '49Ti', 'Null', 'Null']])
    iostopes_used_dataset = hf.create_dataset(
        "spWork/IsotopesUsed", (2, 5),
        data=iostopes_used, dtype=h5py.string_dtype())

    hf.close()
I am trying to save iostopes_used (a 2D NumPy string array) to the HDF5 file as variable-length strings, as shown in the first and second images.
http://docs.h5py.org/en/stable/strings.html is the relevant h5py documentation page.
In [652]: iostopes_used = np.array([['Element', 'Isotope(s)', 'Null', 'Null', 'Null'],
     ...:                           ['Ti', '47Ti', '49Ti', 'Null', 'Null']])
     ...:
In [653]: iostopes_used
Out[653]:
array([['Element', 'Isotope(s)', 'Null', 'Null', 'Null'],
       ['Ti', '47Ti', '49Ti', 'Null', 'Null']], dtype='<U10')
In [654]: f = h5py.File('names.h5','w')
In [655]: iostopes_used_dataset = f.create_dataset("data", data=iostopes_used)
---------------------------------------------------------------------------
....
TypeError: No conversion path for dtype: dtype('<U10')
If we convert the Py3 unicode strings to byte strings (the Py2 string type), the save works:
In [656]: iostopes_used_dataset = f.create_dataset("data", data=iostopes_used.astype('S10'))
In [657]: iostopes_used_dataset[:]
Out[657]:
array([[b'Element', b'Isotope(s)', b'Null', b'Null', b'Null'],
[b'Ti', b'47Ti', b'49Ti', b'Null', b'Null']], dtype='|S10')
In [658]: f.close()
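The same fixed-length route can be wrapped in a small script. This is only a sketch: the helper names save_fixed_length and load_fixed_length and the file name fixed_demo.h5 are made up for illustration, and it assumes every string is pure ASCII (astype('S') raises an error otherwise). The in-memory 'core' driver is used so that the demo leaves no file on disk.

```python
import h5py
import numpy as np


def save_fixed_length(h5file, name, text_array):
    """Write a NumPy unicode array as fixed-width ASCII byte strings."""
    # astype('S') encodes each element as ASCII and raises
    # UnicodeEncodeError if any character falls outside that range.
    return h5file.create_dataset(name, data=text_array.astype('S'))


def load_fixed_length(dataset):
    """Read the byte strings back and decode them to a unicode array."""
    return dataset[...].astype('U')


iostopes_used = np.array([
    ['Element', 'Isotope(s)', 'Null', 'Null', 'Null'],
    ['Ti', '47Ti', '49Ti', 'Null', 'Null'],
])

# driver='core' with backing_store=False keeps the demo file in memory only.
with h5py.File('fixed_demo.h5', 'w', driver='core', backing_store=False) as f:
    dset = save_fixed_length(f, 'data', iostopes_used)
    restored = load_fixed_length(dset)

# Compare the round-tripped values with the original array.
print(restored.tolist() == iostopes_used.tolist())
```

Note that this stores fixed-width strings, not the variable-length strings the question asks for, so it may not match the dataset properties shown in the images.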
===
Another route is variable-length string objects, as shown in a recent SO question:
In [663]: dt = h5py.special_dtype(vlen=str)
In [665]: f.create_dataset('other', iostopes_used.shape, dtype=dt)
Out[665]: <HDF5 dataset "other": shape (2, 5), type "|O">
In [666]: _[:] = iostopes_used
In [667]: _[:]
Out[667]:
array([['Element', 'Isotope(s)', 'Null', 'Null', 'Null'],
['Ti', '47Ti', '49Ti', 'Null', 'Null']], dtype=object)
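For use outside an interactive session, the variable-length route might look like the sketch below. The helper names save_vlen_strings and load_vlen_strings and the file name vlen_demo.h5 are hypothetical; h5py.string_dtype() is the same call used in the question and is the newer spelling of special_dtype(vlen=str). The array is cast to object dtype before writing so h5py never sees the '<U10' dtype it rejected above. On reading, h5py 3 returns bytes objects by default (it also offers Dataset.asstr()), while older versions return str, so the reader here decodes only when needed.

```python
import h5py
import numpy as np


def save_vlen_strings(h5file, name, text_array):
    """Write a NumPy unicode array as variable-length UTF-8 strings."""
    dt = h5py.string_dtype(encoding='utf-8')
    dset = h5file.create_dataset(name, shape=text_array.shape, dtype=dt)
    # Object arrays of Python str map directly onto the vlen string type.
    dset[...] = text_array.astype(object)
    return dset


def load_vlen_strings(dataset):
    """Read the dataset back as an object array of Python str."""
    raw = dataset[...]
    decoded = [v.decode('utf-8') if isinstance(v, bytes) else v
               for v in raw.ravel()]
    out = np.empty(raw.size, dtype=object)
    out[:] = decoded
    return out.reshape(raw.shape)


iostopes_used = np.array([
    ['Element', 'Isotope(s)', 'Null', 'Null', 'Null'],
    ['Ti', '47Ti', '49Ti', 'Null', 'Null'],
])

# driver='core' with backing_store=False keeps the demo file in memory only.
with h5py.File('vlen_demo.h5', 'w', driver='core', backing_store=False) as f:
    dset = save_vlen_strings(f, 'spWork/IsotopesUsed', iostopes_used)
    restored = load_vlen_strings(dset)

# Compare the round-tripped values with the original array.
print(restored.tolist() == iostopes_used.tolist())
```

In the question's save_dataset, the equivalent change would be to pass iostopes_used.astype(object) instead of the '<U10' array.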