Cannot retrieve datasets in PyTables using natural naming

Question

I'm new to PyTables, and I want to retrieve a dataset from an HDF5 file using natural naming, but I get this error with the following input:

f = tables.open_file("filename.h5", "r")

f.root.group-1.dataset-1.read()

group / does not have a child named group

If I try:

f.root.group\-1.dataset\-1.read()

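group / does not have a child named group

unexpected character after line continuation character

I can't change the names of the groups, because this is a large data set from an experiment.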

Answer 1

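You can't use the minus sign (hyphen) with natural naming, because it is not a valid character in a Python variable name (group-1 and dataset-1 look like subtraction operations!). See this discussion:
why-python-does-not-allow-hyphens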
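To see why the hyphenated names fail, here is a minimal sketch that uses only the standard library, no PyTables. The names root and x are placeholders I made up for illustration; the expression is only parsed, never evaluated.

```python
import ast

# A name works with natural naming only if it is a valid Python identifier.
for name in ("group_1", "group-1", "dataset-1"):
    print(name, name.isidentifier())

# The hyphen is parsed as the subtraction operator, not as part of the name.
# 'root' and 'x' are placeholder names; nothing is evaluated here.
tree = ast.parse("root.group-x", mode="eval")
is_subtraction = isinstance(tree.body, ast.BinOp) and isinstance(tree.body.op, ast.Sub)
print(is_subtraction)
```

So Python reads root.group-x as "look up root.group, then subtract x", which would explain why PyTables complains about a missing child named group.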

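If you have groups and datasets that use this naming convention, you have to use the file.get_node() method to access them. Here is a simple code snippet to demonstrate. The first part creates 2 groups and tables (datasets): #1 uses _ and #2 uses - in the group and table names. The second part accesses dataset #1 with natural naming and dataset #2 with file.get_node().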

import tables as tb
import numpy as np

# Create h5 file with 2 groups and datasets:
# '/group_1', 'ds_1' : Natural Naming Supported
# '/group-2', 'ds-2' : Natural Naming NOT Supported
h5f = tb.open_file('SO_55211646.h5', 'w')

h5f.create_group('/', 'group_1')
h5f.create_group('/', 'group-2')

mydtype = np.dtype([('a',float),('b',float),('c',float)])
h5f.create_table('/group_1', 'ds_1', description=mydtype )
h5f.create_table('/group-2', 'ds-2', description=mydtype )

# Close, then Reopen file READ ONLY
h5f.close()

h5f = tb.open_file('SO_55211646.h5', 'r')

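# Dataset #1: natural naming works because the names are valid identifiers
testds_1 = h5f.root.group_1.ds_1.read()
print(testds_1.dtype)

# These aren't valid Python statements (the hyphen parses as subtraction):
# testds-2 = h5f.root.group-2.ds-2.read()
# print(testds-2.dtype)

# Dataset #2: use get_node() with the parent path and the node name
testds_2 = h5f.get_node('/group-2', 'ds-2').read()
print(testds_2.dtype)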

h5f.close()
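If you don't know the exact node names up front, or you would rather pass a single full path, something along these lines may help. This is only a sketch: read_by_path is a hypothetical helper I made up (not part of PyTables), the file name hyphen_demo.h5 and the array contents are invented for the demo, and the NaturalNameWarning filter is there only to keep the output quiet while creating the hyphenated names.

```python
import os
import tempfile
import warnings

import numpy as np
import tables as tb


def read_by_path(h5f, *parts):
    """Hypothetical helper: join name parts into a full HDF5 path and read the node."""
    path = '/' + '/'.join(parts)
    return h5f.get_node(path).read()


# Demo file in a temporary directory (file name is made up for this example)
tmpdir = tempfile.mkdtemp()
h5_path = os.path.join(tmpdir, 'hyphen_demo.h5')

# PyTables warns when a name is not a valid identifier; silence that here
with warnings.catch_warnings():
    warnings.simplefilter('ignore', tb.NaturalNameWarning)
    with tb.open_file(h5_path, 'w') as h5f:
        h5f.create_group('/', 'group-2')
        h5f.create_array('/group-2', 'ds-2', np.array([1.0, 2.0, 3.0]))

with tb.open_file(h5_path, 'r') as h5f:
    # get_node() also accepts a single full path string
    data = read_by_path(h5f, 'group-2', 'ds-2')
    # List every leaf (dataset) path so the exact names can be looked up
    leaf_paths = [node._v_pathname for node in h5f.walk_nodes('/', classname='Leaf')]

print(data)
print(leaf_paths)
```

Listing the paths with walk_nodes() first is handy with experimental files where the group names were generated by an instrument and can't be changed.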


Tags: hdf5, data-analysis, pytables, python-3.x, data-science