How to extract data from HDF5 file to fill PyTables table?

Question

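I'm trying to write a Discord bot in Python. The goal of the bot is to fill a table with entries from users, recording the user name, game name, and game password, then to extract this data for specific users and delete the entries that have been resolved. I used the first tool I found on Google for managing tables, which was PyTables. I'm able to fill a table in an HDF5 file, but I can't retrieve the entries.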

It's important to say that I have never coded in Python before.

This is how I declare the object and create the file to store the entries.

import tables
from tables import StringCol, open_file

class DCParties(tables.IsDescription):
    user_name = StringCol(32)
    game_name = StringCol(16)
    game_pswd = StringCol(16)


# Create the file with an empty table for the entries
h5file = open_file("DCloneTable.h5", mode="w", title="DClone Table")
group = h5file.create_group("/", 'DCloneEntries', "Entries for DClone runs")
table = h5file.create_table(group, 'Entries', DCParties, "Entrées")
h5file.close()

This is how I fill in the entries:

h5file = open_file("DCloneTable.h5", mode="a")
table = h5file.root.DCloneEntries.Entries

particle = table.row
particle['user_name'] = member.author
particle['game_name'] = game_name
particle['game_pswd'] = game_pswd
particle.append()

table.flush()
h5file.close()

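All of this works: with an HDF5 viewer I can see my entries filling the table in the file. However, I then want to read the table stored in the file to extract the data, and that does not work.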

h5file = open_file("DCloneTable.h5", mode="a")
table = h5file.root.DCloneEntries.Entries

particle = table.row

"""???"""

h5file.close()

I tried using particle["user_name"] (because a bare 'user_name' is not defined), and it gives me "b''" as output.

h5file = open_file("DCloneTable.h5", mode="a")
table = h5file.root.DCloneEntries.Entries

particle = table.row
print(f'{particle["user_name"]}')

h5file.close()

b''

If I do this:

h5file = open_file("DCloneTable.h5", mode="a")
table = h5file.root.DCloneEntries.Entries

particle = table.row
print(f'{particle["user_name"]} - {particle["game_name"]} - {particle["game_pswd"]}')

h5file.close()

b'' - b'' - b''

Where am I going wrong? Thanks a lot :)

Answer 1

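Here is a simple way to iterate over the table rows and print them one at a time. Note that table.row only gives you a row buffer for appending new entries, not a row already stored in the table, which is why its fields print as empty. PyTables stores StringCol data as byte strings rather than Unicode strings, and that is why you see the 'b'. To get rid of the 'b', you have to convert back to Unicode with .decode('utf-8'). The code below works with your hard-coded field names; you could use the values in table.colnames to handle any column names. Also, I recommend using Python's file context manager (with/as:) to avoid leaving the file open.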

import tables as tb

with tb.open_file("DCloneTable.h5", mode="r") as h5file:
    table = h5file.root.DCloneEntries.Entries
    print(f'Table Column Names: {table.colnames}')

    # Method to iterate over rows
    for row in table:
        print(f"{row['user_name'].decode('utf-8')} - " +
              f"{row['game_name'].decode('utf-8')} - " +
              f"{row['game_pswd'].decode('utf-8')}")

    # Method to only read the first row, aka table[0]
    print(f"{table[0]['user_name'].decode('utf-8')} - " +
          f"{table[0]['game_name'].decode('utf-8')} - " +
          f"{table[0]['game_pswd'].decode('utf-8')}")
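
The table.colnames idea mentioned above can be sketched roughly as follows. This is only an illustration, not part of the original answer: the helper decode_row, the throwaway file demo.h5, and the two sample entries are all made up for the example, and the helper assumes every column in the table is a string column.

```python
import os
import tempfile

import tables as tb


class DCParties(tb.IsDescription):
    # Same column layout as the table in the question
    user_name = tb.StringCol(32)
    game_name = tb.StringCol(16)
    game_pswd = tb.StringCol(16)


def decode_row(row, colnames):
    """Copy one row into a dict, decoding each bytes field to str.

    Hypothetical helper; assumes every column holds byte strings.
    """
    return {name: row[name].decode("utf-8") for name in colnames}


# Throwaway file with two made-up entries so the sketch runs on its own
path = os.path.join(tempfile.mkdtemp(), "demo.h5")
with tb.open_file(path, mode="w") as h5file:
    group = h5file.create_group("/", "DCloneEntries")
    table = h5file.create_table(group, "Entries", DCParties)
    for user, game, pswd in [("alice", "run1", "pw1"), ("bob", "run2", "pw2")]:
        entry = table.row  # row buffer used for appending
        entry["user_name"] = user.encode("utf-8")
        entry["game_name"] = game.encode("utf-8")
        entry["game_pswd"] = pswd.encode("utf-8")
        entry.append()
    table.flush()

# Read the entries back without hard-coding any field name
with tb.open_file(path, mode="r") as h5file:
    table = h5file.root.DCloneEntries.Entries
    entries = [decode_row(row, table.colnames) for row in table]

for entry in entries:
    print(entry)
```

Building a new dict per row matters here: the row object is reused while iterating, so the values are copied out before moving on to the next row.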

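If you prefer to read all the data at once, you can use the table.read() method to load the data into a NumPy structured array. You still have to convert from bytes to Unicode. The result is "slightly more complicated", so I didn't post that method.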
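
For readers who want to try the table.read() route anyway, here is one possible sketch. It is not the answerer's code: it assumes np.char.decode is an acceptable way to convert a whole bytes column at once, and the file demo_read.h5 and its two entries are invented so the example can run by itself.

```python
import os
import tempfile

import numpy as np
import tables as tb


class DCParties(tb.IsDescription):
    # Same column layout as the table in the question
    user_name = tb.StringCol(32)
    game_name = tb.StringCol(16)
    game_pswd = tb.StringCol(16)


# Throwaway file with two made-up entries so the sketch runs on its own
path = os.path.join(tempfile.mkdtemp(), "demo_read.h5")
with tb.open_file(path, mode="w") as h5file:
    group = h5file.create_group("/", "DCloneEntries")
    table = h5file.create_table(group, "Entries", DCParties)
    for user, game, pswd in [("alice", "run1", "pw1"), ("bob", "run2", "pw2")]:
        entry = table.row
        entry["user_name"] = user.encode("utf-8")
        entry["game_name"] = game.encode("utf-8")
        entry["game_pswd"] = pswd.encode("utf-8")
        entry.append()
    table.flush()

# Load the whole table into memory as a NumPy structured array
with tb.open_file(path, mode="r") as h5file:
    data = h5file.root.DCloneEntries.Entries.read()

# Each field is a fixed-width bytes column; decode a column in one call
user_names = np.char.decode(data["user_name"], "utf-8")
game_names = np.char.decode(data["game_name"], "utf-8")

for user, game in zip(user_names, game_names):
    print(f"{user} - {game}")
```

Because read() copies the data into memory, the array stays usable after the file is closed, which is convenient for a small table like this one.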

python

hdf5

pytables