Can't access returned h5py object instance

Question

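I have a very strange problem here. I have two functions: one reads an HDF5 file created with h5py, and the other creates a new HDF5 file that concatenates the content returned by the first function.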

import h5py

def read_file(filename):
    with h5py.File(filename+".hdf5",'r') as hf:

        group1 = hf.get('group1')
        group2 = hf.get('group2')
        dataset1 = hf.get('dataset1')
        dataset2 = hf.get('dataset2')
        print group1.attrs['w'] # Works here

        return dataset1, dataset2, group1, group2

And the function that creates the file:

def create_chunk(start_index, end_index):

    for i in range(start_index, end_index):
        if i == start_index:
            mergedhf = h5py.File("output.hdf5",'w')
            mergedhf.create_dataset("dataset1",dtype='float64')
            mergedhf.create_dataset("dataset2",dtype='float64')

            g1 = mergedhf.create_group('group1')
            g2 = mergedhf.create_group('group2')

    rd1,rd2,rg1,rg2 = read_file(filename)

    print rg1.attrs['w'] #gives me <Closed HDF5 group> message

    g1.attrs['w'] = "content"
    g1.attrs['x'] = "content"
    g2.attrs['y'] = "content"
    g2.attrs['z'] = "content"
    print g1.attrs['w'] # Works Here
    return mergedhf.get('dataset1'), mergedhf.get('dataset2'), g1, g2

def calling_function():
    wd1, wd2, wg1, wg2 = create_chunk(start_index, end_index)
    print wg1.attrs['w'] #Works here as well

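Now the problem is this: I can access the datasets and attributes of the newly created file, represented by wd1, wd2, wg1 and wg2, but I can't do the same with the objects that read_file read and returned.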

Can anyone help me get the values of the datasets and groups after I have returned references to them to the calling function?

Answer 1

The problem is in read_file, on this line:

with h5py.File(filename+".hdf5",'r') as hf:

This closes hf at the end of the with block, i.e. when read_file returns. When this happens, the datasets and groups also get closed and you can no longer access them.
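The effect can be reproduced in isolation. The following is a minimal sketch, assuming Python 3 and a throwaway file in a temporary directory (the path and the variable names are made up for the demonstration); it shows a group handle that is valid inside the with block and closed after it:

```python
import os
import tempfile

import h5py

# Hypothetical scratch file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.hdf5")
with h5py.File(path, "w") as hf:
    hf.create_group("group1").attrs["w"] = "content"

with h5py.File(path, "r") as hf:
    group1 = hf["group1"]
    open_inside = bool(group1)   # truthy while the file is open

# Leaving the with block closed the file and every object opened from it.
closed_after = not bool(group1)
print(repr(group1))              # repr now reports a closed group
```

An h5py object evaluates as false once its underlying identifier is closed, which gives a cheap way to check whether a handle is still usable.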

There are (at least) two ways to fix this. First, you can open the file the same way you do in create_chunk:

hf = h5py.File(filename+".hdf5", 'r')

and keep the reference to hf around for as long as you need it, then close it:

hf.close()
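One way to arrange this first approach, sketched under assumptions: read_file returns the open file together with the objects, and the caller is responsible for closing it. The return signature, the temporary input file and the names base, w and total are illustrative, not part of the original code.

```python
import os
import tempfile

import h5py


def read_file(filename):
    """Open the file and return it along with the requested objects.

    The caller owns the returned file and must close it.
    """
    hf = h5py.File(filename + ".hdf5", "r")
    return hf, hf["dataset1"], hf["group1"]


# Hypothetical input file so that the sketch runs on its own.
base = os.path.join(tempfile.mkdtemp(), "input")
with h5py.File(base + ".hdf5", "w") as out:
    out.create_dataset("dataset1", data=[1.0, 2.0, 3.0])
    out.create_group("group1").attrs["w"] = "content"

hf, dataset1, group1 = read_file(base)
try:
    w = group1.attrs["w"]              # works: hf is still open
    total = float(dataset1[:].sum())   # data can still be read
finally:
    hf.close()                         # handles are invalid from here on
```

The try/finally makes sure the file is closed even if the work in between raises.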

The other way is to copy the data out of the datasets inside read_file and return the copies:

dataset1 = hf.get('dataset1')[:]
dataset2 = hf.get('dataset2')[:]

Note that you can't do this with groups. The file needs to stay open for as long as you need to work with the groups.
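A group object itself cannot be copied out of the file, but if all you need from it is its attributes, those can be read into a plain dict before the file closes. A minimal sketch of the copying approach, assuming Python 3; the two-value return signature and the temporary input file are assumptions made for the example:

```python
import os
import tempfile

import h5py


def read_file(filename):
    """Return in-memory copies that stay usable after the file closes."""
    with h5py.File(filename + ".hdf5", "r") as hf:
        dataset1 = hf["dataset1"][:]               # NumPy array copy of the data
        group1_attrs = dict(hf["group1"].attrs)    # plain dict copy of the attributes
    return dataset1, group1_attrs


# Hypothetical input file so that the sketch runs on its own.
base = os.path.join(tempfile.mkdtemp(), "input")
with h5py.File(base + ".hdf5", "w") as out:
    out.create_dataset("dataset1", data=[1.0, 2.0, 3.0])
    out.create_group("group1").attrs["w"] = "content"

data, attrs = read_file(base)
print(data, attrs["w"])
```

Here the with block can stay, because nothing that is returned still refers to the file.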

Answer 2

Adding to @Yossarian's answer:

The problem is in read_file, this line: with h5py.File(filename+".hdf5",'r') as hf: This closes hf at the end of the with block, i.e. when read_file returns. When this happens, the datasets and groups also get closed and you can no longer access them.

For anyone who runs into this problem while reading a scalar dataset, make sure to index with [()]:

scalar_dataset1 = hf['scalar_dataset1'][()]

Preface

I had a similar issue to the OP, which resulted in a return value of <closed hdf5 dataset>. However, when I tried to slice my scalar dataset with [:], I got a ValueError:

"ValueError: Illegal slicing argument for scalar dataspace"

Indexing with [()], together with @Yossarian's answer, solved my problem.
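A small sketch of the [()] indexing described above, assuming Python 3 and a throwaway file whose dataset names are invented for the example. A scalar dataset has the empty tuple as its shape, and [()] also reads a whole non-scalar dataset, so it can serve as a single way to copy either kind into memory:

```python
import os
import tempfile

import h5py

# Hypothetical scratch file with one scalar and one 1-D dataset.
path = os.path.join(tempfile.mkdtemp(), "scalars.hdf5")
with h5py.File(path, "w") as hf:
    hf.create_dataset("scalar_dataset1", data=3.5)
    hf.create_dataset("dataset1", data=[1.0, 2.0])

with h5py.File(path, "r") as hf:
    scalar_shape = hf["scalar_dataset1"].shape   # () marks a scalar dataspace
    scalar_value = hf["scalar_dataset1"][()]     # copies the single value
    array_value = hf["dataset1"][()]             # copies the whole array

# The copies remain usable after the file has been closed.
print(scalar_shape, scalar_value, array_value)
```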

python

hdf5

python-2.7

h5py
