In a Pandas categorical, what is format="table"?
The HDF5 format apparently does not support categoricals when format="fixed" is used. The following example
s = pd.Series(['a','b','a','b'],dtype='category')
s.to_hdf('s.h5','s')
raises the error:
NotImplementedError: Cannot store a category dtype in a HDF5 dataset that uses format="fixed". Use format="table".
How do I construct a categorical Series with format 'table'?
Specify format='table' or format='t' in pd.Series.to_hdf:
s.to_hdf('s.h5', key='s', format='t')
Note that this is also what the error message suggests. According to the docs:
format : ‘fixed(f)|table(t)’, default is ‘fixed’
fixed(f) : Fixed format Fast writing/reading. Not-appendable, nor
searchable
table(t) : Table format Write as a PyTables Table structure which may
perform worse but allow more flexible operations like searching /
selecting subsets of the data
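As a rough illustration of the table format described above, here is a minimal round-trip sketch. It assumes the optional PyTables package (`tables`) is installed; the helper name `roundtrip_categorical` and the temporary file path are hypothetical and only serve the example. The `where='index >= 2'` filter is one way to exercise the "selecting subsets" capability that the docs attribute to table format.

```python
import os
import tempfile

import pandas as pd  # HDF5 support also needs the optional PyTables package ("tables")


def roundtrip_categorical(path):
    """Write a categorical Series in table format, then read it back.

    Hypothetical helper for illustration only.
    """
    s = pd.Series(['a', 'b', 'a', 'b'], dtype='category')
    # format='table' (or the short form 't') is what the error message asks for
    s.to_hdf(path, key='s', format='table')
    # Read the whole Series back from the file
    restored = pd.read_hdf(path, key='s')
    # Table format also allows selecting a subset of rows at read time
    subset = pd.read_hdf(path, key='s', where='index >= 2')
    return restored, subset


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        restored, subset = roundtrip_categorical(os.path.join(tmp, 's.h5'))
        print(restored.dtype)
        print(subset)
```

If only whole-object reads and writes are needed and the data is not categorical, the default fixed format is likely the faster choice; table format trades some speed for appending and querying.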