How to properly open, read from and save to a single file in Python 3 using h5py

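I'm new to Python and to programming in general, so I may be making terrible mistakes. Any help is appreciated. I want to initialize the members of a class either by loading some HDF5 data prepared by someone else or by loading my own HDF5 file. I tried this: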

import re
import numpy as np
import h5py
import sys

class ashot:
    def __init__(self, path, load=False):
        if load is False:
            self.name = "_".join(re.findall(r"(\d+)_(\d+)/aa/shot_(\d+)", path)[0])
            f = h5py.File(path, "r")
            numpyarray = f["data/data"]
            self.array = numpyarray
        else:
            f = h5py.File(path, "a")
            self.array = f["array"]
            self.name = f["array"].attrs["name"]

    def saveshot(self):
        s = h5py.File(self.name+".h5", "a")
        s.create_dataset("array", data=self.array)
        s["array"].attrs["name"] = self.name
        s.close()
        return()

But if I run it using:

testshot = ashot("somepath to data storage")
testshot.saveshot()
loadshot = ashot("the path I stored the shot testshot", load = True)
loadshot.saveshot()

I get

Traceback (most recent call last):
  File "program path.py", line 191, in <module>
    loadshot.saveshot()
  File "program path.py", line 114, in saveshot
    s.create_dataset("array", data=self.array)
  File "C:\Users\Drossel\AppData\Local\Programs\Python\Python36\lib\site-packages\h5py\_hl\group.py", line 109, in create_dataset
    self[name] = dset
  File "C:\Users\Drossel\AppData\Local\Programs\Python\Python36\lib\site-packages\h5py\_hl\group.py", line 277, in __setitem__
    h5o.link(obj.id, self.id, name, lcpl=lcpl, lapl=self._lapl)
  File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py\h5o.pyx", line 202, in h5py.h5o.link
RuntimeError: Unable to create link (name already exists)

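I more or less understand that I am trying to write to a file that is already open, but for some reason the same code works when I use numpy.save and numpy.load instead. I tried closing the file after assigning self.array, but then I get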

NameError: name 'ashot' is not defined

because, I assume, at that point f is just a file handle. What am I doing wrong?

You are not allowed to create a dataset with the same name twice:

In [34]: F = h5py.File('testh546643026.h5','a')
In [35]: ds = F.create_dataset('tst',data=np.arange(3))
In [36]: F.close()
In [37]: F = h5py.File('testh546643026.h5','a')
In [38]: ds = F.create_dataset('tst',data=np.arange(3))
....
RuntimeError: Unable to create link (Name already exists)
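
The same behaviour can be reproduced as a standalone script. This is only a minimal sketch: the scratch file path is made up for the illustration, and the exception type is caught broadly because, as far as I know, older h5py releases raise RuntimeError here while newer ones raise ValueError. It also shows that a membership test on the file tells you in advance whether the name is taken.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "demo.h5")

# First session: the name "tst" is free, so creation succeeds.
with h5py.File(path, "a") as f:
    f.create_dataset("tst", data=np.arange(3))

# Second session: the file is reopened in append mode and "tst" already exists.
with h5py.File(path, "a") as f:
    already_there = "tst" in f  # membership test on the file's root group
    try:
        f.create_dataset("tst", data=np.arange(3))
        second_create_failed = False
    except (RuntimeError, ValueError):
        # The exact exception class depends on the installed h5py version.
        second_create_failed = True

print(already_there, second_create_failed)
```

Using the file as a context manager closes it even when the exception is raised, so no stale handle is left behind.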

require_dataset returns an existing dataset (or creates a new one), but the shape and dtype must match (see its documentation):

In [41]: ds = F.require_dataset('tst',(3,),int)
In [42]: ds
Out[42]: <HDF5 dataset "tst": shape (3,), type "<i4">
In [43]: ds[:]
Out[43]: array([0, 1, 2])
In [44]: ds[:]=np.ones((3,))
In [45]: ds[:]
Out[45]: array([1, 1, 1])
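
As a script, the require_dataset approach might look like the sketch below. The file path and the explicit "i8" dtype are assumptions chosen for the illustration (the session above got a platform-dependent int). The last step shows what "must match" means in practice: asking for a different shape is rejected with a TypeError.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "demo.h5")

with h5py.File(path, "a") as f:
    # First call: "tst" does not exist yet, so it is created.
    ds = f.require_dataset("tst", shape=(3,), dtype="i8")
    ds[:] = np.arange(3)

with h5py.File(path, "a") as f:
    # Second call: same shape and dtype, so the existing dataset is returned.
    ds = f.require_dataset("tst", shape=(3,), dtype="i8")
    before = ds[:]      # read the current contents into a NumPy array
    ds[:] = np.ones(3)  # overwrite in place; the floats are stored as integers
    after = ds[:]

    try:
        # A different shape does not match the existing dataset.
        f.require_dataset("tst", shape=(5,), dtype="i8")
        mismatch_rejected = False
    except TypeError:
        mismatch_rejected = True

print(before, after, mismatch_rejected)
```

This is a good fit when the array always has the same shape and type and only its values change between saves.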

If you want to freely replace an existing dataset, you have to delete it first.
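
A minimal sketch of that delete-then-create pattern follows. The helper names save_array and load_array and the file path are hypothetical, not part of h5py or of the code in the question. The loader copies the data into a NumPy array before the file is closed, since an h5py dataset object is only usable while its file is open.

```python
import os
import tempfile

import h5py
import numpy as np


def save_array(path, name, data):
    """Write `data` to dataset `name`, replacing any existing dataset."""
    with h5py.File(path, "a") as f:
        if name in f:
            del f[name]  # remove the old link so the name is free again
        f.create_dataset(name, data=data)


def load_array(path, name):
    """Return the dataset contents as an in-memory NumPy array."""
    with h5py.File(path, "r") as f:
        return f[name][...]  # copy the data out before the file closes


# Hypothetical scratch file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "demo.h5")
save_array(path, "array", np.arange(3))
save_array(path, "array", np.zeros((2, 2)))  # a different shape and dtype is fine
result = load_array(path, "array")
print(result.shape, result.dtype)
```

Unlike require_dataset, this places no restriction on the new shape or dtype. One caveat: deleting a dataset removes its link, but the HDF5 file does not necessarily shrink, so repeated replacement can leave unused space in the file.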