How to properly open, read from, and save to a single file in Python 3 using h5py
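I am new to Python and to programming in general, so I may be making terrible mistakes. Any help is appreciated. I want to initialize the members of a class either by loading HDF5 data prepared by someone else or by loading my own HDF5 file. I tried this: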
import re  # needed for re.findall below
import numpy as np
import h5py
import sys

class ashot:
    def __init__(self, path, load=False):
        if load is False:
            self.name = "_".join(re.findall(r"(\d+)_(\d+)/aa/shot_(\d+)", path)[0])
            f = h5py.File(path, "r")
            numpyarray = f["data/data"]
            self.array = numpyarray
        else:
            f = h5py.File(path, "a")
            self.array = f["array"]
            self.name = f["array"].attrs["name"]

    def saveshot(self):
        s = h5py.File(self.name+".h5", "a")
        s.create_dataset("array", data=self.array)
        s["array"].attrs["name"] = self.name
        s.close()
        return()
But when I run it with:
testshot = ashot("somepath to data storage")
testshot.saveshot()
loadshot = ashot("the path I stored the shot testshot", load = True)
loadshot.saveshot()
I get:
Traceback (most recent call last):
  File "program path.py", line 191, in <module>
    loadshot.saveshot()
  File "program path.py", line 114, in saveshot
    s.create_dataset("array", data=self.array)
  File "C:\Users\Drossel\AppData\Local\Programs\Python\Python36\lib\site-packages\h5py\_hl\group.py", line 109, in create_dataset
    self[name] = dset
  File "C:\Users\Drossel\AppData\Local\Programs\Python\Python36\lib\site-packages\h5py\_hl\group.py", line 277, in __setitem__
    h5o.link(obj.id, self.id, name, lcpl=lcpl, lapl=self._lapl)
  File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py\h5o.pyx", line 202, in h5py.h5o.link
RuntimeError: Unable to create link (name already exists)
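I more or less understand that I am trying to write to a file that is already open, but for some reason the same code works when I use numpy.save and numpy.load. I tried closing the file after assigning self.array, but then I get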
NameError: name 'ashot' is not defined
because, I assume, f is just a file handle at that point. What am I doing wrong?
You are not allowed to create the same dataset twice:
In [34]: F = h5py.File('testh546643026.h5','a')
In [35]: ds = F.create_dataset('tst',data=np.arange(3))
In [36]: F.close()
In [37]: F = h5py.File('testh546643026.h5','a')
In [38]: ds = F.create_dataset('tst',data=np.arange(3))
....
RuntimeError: Unable to create link (Name already exists)
require_dataset returns an existing dataset (or creates a new one), but the shape and dtype must match (see its documentation):
In [41]: ds = F.require_dataset('tst',(3,),int)
In [42]: ds
Out[42]: <HDF5 dataset "tst": shape (3,), type "<i4">
In [43]: ds.value
Out[43]: array([0, 1, 2])
In [44]: ds[:]=np.ones((3,))
In [45]: ds.value
Out[45]: array([1, 1, 1])
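The session above can be sketched as a standalone script. This is only an illustration: the scratch file name `require_demo.h5` is made up, and the data is read with `ds[()]` instead of `ds.value`, because the `.value` property was removed in h5py 3.0. The last part shows what happens when the requested shape does not match the stored one.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "require_demo.h5")

# First session: create the dataset.
with h5py.File(path, "a") as f:
    f.create_dataset("tst", data=np.arange(3, dtype="int64"))

# Second session: fetch the existing dataset instead of creating it again.
with h5py.File(path, "a") as f:
    # Shape and dtype match the stored dataset, so it is returned as-is.
    ds = f.require_dataset("tst", shape=(3,), dtype="int64")
    before = ds[()]      # read the whole dataset into a NumPy array
    ds[:] = np.ones(3)   # overwrite the contents in place
    after = ds[()]

    # Asking for a different shape is rejected with a TypeError.
    try:
        f.require_dataset("tst", shape=(4,), dtype="int64")
        mismatch_rejected = False
    except TypeError:
        mismatch_rejected = True

print(before, after, mismatch_rejected)
```

Because `require_dataset` never changes the shape or dtype of what is already stored, it only helps when the new data has the same layout as the old.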
If you want to freely replace an existing dataset, you have to delete it first.
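A minimal sketch of that delete-then-create pattern, applied to the question's save and load steps. The helper names `save_array` and `load_array` and the file name `shot_demo.h5` are hypothetical; the dataset name "array" and the "name" attribute follow the question. The sketch also assumes that the question's class wants a plain NumPy array rather than an h5py dataset handle: `f["array"]` is tied to the open file, while `dset[()]` copies the data into memory, so the file can be closed safely afterwards.

```python
import os
import tempfile

import h5py
import numpy as np


def save_array(path, array, name):
    """Write `array` to the dataset "array" in `path`, replacing any old one."""
    with h5py.File(path, "a") as f:
        if "array" in f:
            del f["array"]  # unlink the old dataset so the name is free again
        dset = f.create_dataset("array", data=array)
        dset.attrs["name"] = name


def load_array(path):
    """Return (array, name) held in memory; the file is closed on return."""
    with h5py.File(path, "r") as f:
        dset = f["array"]
        # dset[()] reads the data into a NumPy array before the file closes.
        return dset[()], dset.attrs["name"]


# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "shot_demo.h5")
save_array(path, np.arange(3), "shot_demo")
save_array(path, np.arange(3) * 2, "shot_demo")  # the second save does not raise
array, name = load_array(path)
print(array, name)
```

Note that deleting a dataset only removes the link to it. The HDF5 file does not necessarily shrink, so for frequent rewrites of same-shaped data, writing in place with `ds[...] = new_data` may be the better fit.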