In a Pandas categorical, what is format="table"?

The HDF5 format apparently does not support categoricals with format="fixed". The following example

import pandas as pd

s = pd.Series(['a','b','a','b'],dtype='category')
s.to_hdf('s.h5', key='s')

returns the error:

NotImplementedError: Cannot store a category dtype in a HDF5 dataset that uses format="fixed". Use format="table".

How do I construct a categorical series with format 'table'?

Specify format='table' or format='t' in pd.Series.to_hdf:

s.to_hdf('s.h5', key='s', format='t')
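As a quick check, here is a minimal round-trip sketch. It assumes PyTables (the `tables` package) is installed, and the temporary file path and the `restored` name are made up for illustration:

```python
import os
import tempfile

import pandas as pd

s = pd.Series(['a', 'b', 'a', 'b'], dtype='category')

# Hypothetical scratch location so the example does not touch the working directory.
path = os.path.join(tempfile.mkdtemp(), 's.h5')

# Table format accepts the category dtype; the default fixed format does not.
s.to_hdf(path, key='s', format='table')

# Read the series back and inspect its dtype.
restored = pd.read_hdf(path, key='s')
print(restored.dtype)
```

The series read back should still be categorical, with the same values in the same order.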

Note that format='table' is also what the error message suggests. According to the docs:

format : ‘fixed(f)|table(t)’, default is ‘fixed’

fixed(f) : Fixed format. Fast writing/reading. Not-appendable, nor searchable.

table(t) : Table format. Write as a PyTables Table structure which may perform worse but allow more flexible operations like searching / selecting subsets of the data.
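To illustrate the "selecting subsets" part of that description, here is a small sketch that reads only some rows back from a table-format file. It assumes PyTables is installed; the file path and the `subset` name are hypothetical, and the `where` clause filters on the default integer index:

```python
import os
import tempfile

import pandas as pd

s = pd.Series(['a', 'b', 'a', 'b'], dtype='category')

# Hypothetical scratch file for the example.
path = os.path.join(tempfile.mkdtemp(), 's.h5')
s.to_hdf(path, key='s', format='table')

# A where clause is only supported for table format; here it keeps rows with index >= 2.
subset = pd.read_hdf(path, key='s', where='index >= 2')
print(subset)
```

With format='fixed' the same `where` argument is not available, which is the trade-off the docs describe: fixed is faster, table is queryable.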