In h5py file upload, what is the difference between np.array(file[key][:]) and np.array(file[key])
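I am currently working on extracting data from h5py files. When I run the script below, both lines seem to output the same result. Does anyone know the difference?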
import h5py
import numpy as np

train_dataset = h5py.File('datasets/train_happy.h5', "r")
train_set_x_orig1 = np.array(train_dataset["train_set_x"][:])
train_set_x_orig2 = np.array(train_dataset["train_set_y"])
Thanks to anyone who offers input!
Using a sample file from another SO question:
In [183]: f = h5py.File('temp.h5','r')
In [184]: list(f.keys())
Out[184]: ['db1', 'db2', 'db3', 'db4', 'temp']
Simply indexing by key (a dictionary-like operation) returns a Dataset:
In [185]: x = f['db1']
In [186]: type(x)
Out[186]: h5py._hl.dataset.Dataset
In [187]: x
Out[187]: <HDF5 dataset "db1": shape (5,), type "|V4">
Adding [:] (or some other indexing) is enough to load the data into an array:
In [188]: y = f['db1'][:]
In [189]: type(y)
Out[189]: numpy.ndarray
In [190]: y
Out[190]: array([('a',), ('ab',), ('',), ('',), ('',)], dtype=[('str', 'O')])
No further np.array wrapper is needed.
http://docs.h5py.org/en/latest/high/dataset.html#reading-writing-data
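To make the comparison concrete, here is a minimal sketch that writes a small file and reads it back in the ways discussed above. The file name demo.h5 and the array contents are made up for illustration; only the dataset name train_set_x is borrowed from the question.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "demo.h5")
with h5py.File(path, "w") as f:
    f.create_dataset("train_set_x", data=np.arange(12).reshape(3, 4))

with h5py.File(path, "r") as f:
    dset = f["train_set_x"]              # an h5py Dataset handle, not an array
    is_dataset = isinstance(dset, h5py.Dataset)

    sliced = dset[:]                     # slicing reads the data into an ndarray
    wrapped = np.array(dset)             # np.array also reads the whole dataset
    double = np.array(dset[:])           # reads, then copies the ndarray again
    first_row = dset[0]                  # indexing can also read just a part

# All three full reads hold the same values.
print(np.array_equal(sliced, wrapped), np.array_equal(sliced, double))
```

In this sketch the three full reads end up with identical arrays; np.array(dset[:]) just adds one more in-memory copy on top of the read. Indexing has the extra advantage that it can load only part of a dataset, as with first_row.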
The value property also works (I'm not sure where this is documented):
In [191]: x.value
Out[191]: array([('a',), ('ab',), ('',), ('',), ('',)], dtype=[('str', 'O')])
The np.array wrapper works too:
In [192]: np.array(x)
Out[192]: array([('a',), ('ab',), ('',), ('',), ('',)], dtype=[('str', 'O')])
A set of timeits didn't show any difference between these approaches.
But note this in the release notes:
http://docs.h5py.org/en/latest/whatsnew/2.1.html#dataset-value-property-is-now-deprecated
The property Dataset.value, which dates back to h5py 1.0, is deprecated and will be removed in a later release. This property dumps the entire dataset into a NumPy array. Code using .value should be updated to use NumPy indexing, using mydataset[...] or mydataset[()] as appropriate.
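As a rough sketch of the replacement the release note recommends, the snippet below reads one array dataset and one scalar dataset with mydataset[...] and mydataset[()]. The file and dataset names are hypothetical, and the comments on return types reflect my understanding of h5py's documented behaviour rather than anything shown in the session above.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file and dataset names, for illustration only.
path = os.path.join(tempfile.mkdtemp(), "indexing_demo.h5")
with h5py.File(path, "w") as f:
    f.create_dataset("vector", data=np.array([1.0, 2.0, 3.0]))
    f.create_dataset("scalar", data=42)

with h5py.File(path, "r") as f:
    vec_ellipsis = f["vector"][...]   # whole dataset as an ndarray
    vec_empty = f["vector"][()]       # also the whole dataset as an ndarray
    scalar_value = f["scalar"][()]    # scalar dataset: a plain NumPy scalar
    scalar_array = f["scalar"][...]   # scalar dataset: a 0-dimensional array

print(vec_ellipsis, scalar_value, np.shape(scalar_array))
```

For ordinary array datasets either form can stand in for .value; the two differ only for scalar datasets, which is presumably what "as appropriate" refers to.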