HDFStore: updating a stored HDF5 pandas DataFrame
I have two dataframes: df1, which is stored in a pd.HDFStore object, and another one that I want to append to it.
import numpy as np
import pandas as pd
store = pd.HDFStore('dataframe_store.h5')
df1 = pd.DataFrame(np.empty((100, 5)))
df2 = pd.DataFrame(np.empty((100, 5)))
store['df1'] = df1
In effect, I want the end result to be equivalent to:
store['df1'] = pd.concat([df1, df2])  # df1.append(df2) on pandas < 2.0
I want to append df2 to the stored df1 rather than completely overwrite the HDFStore object with a new dataframe. Is this possible?
Also, when I run the following code I get ValueError: can only append to Tables. Why is that?
df = pd.DataFrame(np.empty((1000, 5)))
df2 = pd.DataFrame(np.empty((1000, 5)))
store = pd.HDFStore('store.h5')
store['df'] = df
store.append('df', df2)
According to the docs (emphasis mine):
HDFStore supports another PyTables format on disk, the table
format. Conceptually a table is shaped very much like a DataFrame, with rows and
columns. A table may be appended to in the same or other sessions. In addition,
delete & query type operations are supported. This format is specified by
format='table' or format='t' to append or put or to_hdf
New in version 0.13.
This format can be set as an option as well pd.set_option('io.hdf.default_format','table') to enable put/append/to_hdf to by default store in the table format.
In [361]: store = pd.HDFStore('store.h5')
In [362]: df1 = df[0:4]
In [363]: df2 = df[4:]
# append data (creates a table automatically)
In [364]: store.append('df', df1)
In [365]: store.append('df', df2)
In [366]: store
Out[366]:
<class 'pandas.io.pytables.HDFStore'>
File path: store.h5
# select the entire object
In [367]: store.select('df')
Out[367]:
A B C
2000-01-01 0.887163 0.859588 -0.636524
2000-01-02 0.015696 -2.242685 1.150036
2000-01-03 0.991946 0.953324 -2.021255
2000-01-04 -0.334077 0.002118 0.405453
2000-01-05 0.289092 1.321158 -1.546906
2000-01-06 -0.202646 -0.655969 0.193421
2000-01-07 0.553439 1.318152 -0.469305
2000-01-08 0.675554 -1.817027 -0.183109
# the type of stored data
In [368]: store.root.df._v_attrs.pandas_type
Out[368]: 'frame_table'
Note: You can also create a table by passing format='table' or format='t' to a put operation.
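Applied to the question, the quoted passage suggests that store['df1'] = df1 writes the default fixed format, which is why the later append raises the ValueError. A minimal sketch of one way around it follows, assuming PyTables is installed. The temporary file path, the column names a to e, and the zero/one fill values are illustrative choices that are not in the original post.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Illustrative data: zeros and ones make it easy to see which rows came from where.
df1 = pd.DataFrame(np.zeros((100, 5)), columns=list("abcde"))
df2 = pd.DataFrame(np.ones((100, 5)), columns=list("abcde"))

# Hypothetical location; any writable path works.
path = os.path.join(tempfile.mkdtemp(), "dataframe_store.h5")

with pd.HDFStore(path) as store:
    # Write df1 in the appendable table format instead of the default fixed format.
    store.put("df1", df1, format="table")
    # Append df2 to the stored table without rewriting df1.
    store.append("df1", df2)
    combined = store.select("df1")
    stored_type = str(store.root.df1._v_attrs.pandas_type)

print(combined.shape)
print(stored_type)
```

Calling store.append('df1', df1) for the first write would also work, since append creates a table automatically. Note that the appended rows keep their original index labels, so the combined frame here has duplicate index values.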