When loading an h5py file, what is the difference between np.array(file[key][:]) and np.array(file[key])?

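I'm currently working on extracting data from an h5py file. When I run the script below, both forms seem to output the same result. Does anyone know the difference?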

import h5py
import numpy as np

train_dataset = h5py.File('datasets/train_happy.h5', "r")
train_set_x_orig1 = np.array(train_dataset["train_set_x"][:])  # slice first, then wrap
train_set_x_orig2 = np.array(train_dataset["train_set_x"])     # wrap the Dataset directly

Thanks to anyone who can offer input!

Using an example file from another SO question:

In [183]: f = h5py.File('temp.h5','r')
In [184]: list(f.keys())
Out[184]: ['db1', 'db2', 'db3', 'db4', 'temp']

Simply indexing the file by key returns a Dataset (a dictionary-like operation):

In [185]: x = f['db1']
In [186]: type(x)
Out[186]: h5py._hl.dataset.Dataset
In [187]: x
Out[187]: <HDF5 dataset "db1": shape (5,), type "|V4">

Adding [:] (or some other index) is enough to load the data into a NumPy array:

In [188]: y = f['db1'][:]
In [189]: type(y)
Out[189]: numpy.ndarray
In [190]: y
Out[190]: array([('a',), ('ab',), ('',), ('',), ('',)], dtype=[('str', 'O')])

No further np.array wrapper is needed.

http://docs.h5py.org/en/latest/high/dataset.html#reading-writing-data
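To make the distinction concrete, here is a minimal sketch. It assumes an in-memory HDF5 file created with the core driver so nothing touches the disk; the file name sketch.h5 and the dataset name demo are made up for illustration and are not from the question's file.

```python
import h5py
import numpy as np

# Hypothetical in-memory file: driver="core" with backing_store=False
# keeps everything in RAM and writes nothing to disk.
with h5py.File("sketch.h5", "w", driver="core", backing_store=False) as f:
    f.create_dataset("demo", data=np.arange(6).reshape(2, 3))

    handle = f["demo"]           # h5py Dataset: a handle, no data read yet
    sliced = f["demo"][:]        # indexing reads the data into a numpy.ndarray
    wrapped = np.array(handle)   # np.array() also reads the whole dataset
    partial = handle[0, :2]      # only the selected elements are read

    print(type(handle).__name__)
    print(type(sliced).__name__)
    print(np.array_equal(sliced, wrapped))
```

Since f[key][:] is already an ndarray, wrapping it in np.array only makes an extra copy. Keeping the Dataset handle around is mainly useful when only part of a large dataset should be read, as with partial above.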

The value attribute also works (I'm not sure where this is documented):

In [191]: x.value
Out[191]: array([('a',), ('ab',), ('',), ('',), ('',)], dtype=[('str', 'O')])

The np.array wrapper works as well:

In [192]: np.array(x)
Out[192]: array([('a',), ('ab',), ('',), ('',), ('',)], dtype=[('str', 'O')])

A set of timeit runs did not show any difference between these forms.
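A comparison along these lines could be run as follows. This is only a sketch: the in-memory file, the dataset name demo, and the array size are assumptions chosen for illustration, and the absolute numbers will vary by machine and h5py version.

```python
import timeit

import h5py
import numpy as np

# Hypothetical in-memory file; nothing is written to disk.
with h5py.File("timing_sketch.h5", "w", driver="core", backing_store=False) as f:
    ds = f.create_dataset("demo", data=np.random.rand(1000, 100))

    # Time each way of pulling the full dataset into memory.
    t_slice = timeit.timeit(lambda: ds[:], number=200)
    t_slice_wrapped = timeit.timeit(lambda: np.array(ds[:]), number=200)
    t_wrapped = timeit.timeit(lambda: np.array(ds), number=200)

print(f"ds[:]            {t_slice:.4f} s")
print(f"np.array(ds[:])  {t_slice_wrapped:.4f} s")
print(f"np.array(ds)     {t_wrapped:.4f} s")
```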

However, the release notes say:

http://docs.h5py.org/en/latest/whatsnew/2.1.html#dataset-value-property-is-now-deprecated

The property Dataset.value, which dates back to h5py 1.0, is deprecated and will be removed in a later release. This property dumps the entire dataset into a NumPy array. Code using .value should be updated to use NumPy indexing, using mydataset[...] or mydataset[()] as appropriate.
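The indexing forms recommended in that note can be sketched like this. The in-memory file and the dataset names vec and scalar are hypothetical; as far as I know, .value was eventually removed in h5py 3.0, so the indexing forms are the portable choice.

```python
import h5py
import numpy as np

# Hypothetical in-memory file; nothing is written to disk.
with h5py.File("sketch.h5", "w", driver="core", backing_store=False) as f:
    vec = f.create_dataset("vec", data=np.array([1.0, 2.0, 3.0]))
    scalar = f.create_dataset("scalar", data=42)

    whole = vec[...]     # Ellipsis reads the entire dataset as an ndarray
    same = vec[()]       # an empty tuple also reads the entire dataset
    number = scalar[()]  # for a scalar dataset, [()] returns the single value

    print(whole)
    print(number)
```

The [()] form is the one that also covers scalar datasets, which have no axis to slice; that is presumably why the note says "as appropriate".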