Pandas Panel to DataFrame indexing
I have the following code:
import pandas as pd
import pandas_datareader.data as web
pdata = pd.Panel(dict((stk, web.get_data_yahoo(stk, '1/1/2009', '6/1/2012'))
                      for stk in ['AAPL', 'GOOG', 'MSFT']))
pdata
<class 'pandas.core.panel.Panel'>
Dimensions: 6 (items) x 861 (major_axis) x 3 (minor_axis)
Items axis: Open to Volume
Major_axis axis: 2009-01-02 00:00:00 to 2012-06-01 00:00:00
Minor_axis axis: AAPL to MSFT
                           AAPL        GOOG       MSFT
Date       minor
2009-01-02 Open       12.268572  153.302917  19.530001
           High       13.005714  159.870193  20.400000
           Low        12.165714  151.762924  19.370001
           Close      12.964286  159.621811  20.330000
           Adj Close  11.621618  159.621811  16.140903
Normally the following would do what I need:
pdata = pdata.swapaxes('items', 'minor')
But I get the following warning:
Panel is deprecated and will be removed in a future version.
The recommended way to represent these types of 3-dimensional data are
with a MultiIndex on a DataFrame, via the Panel.to_frame() method
My objective is a panel-like DataFrame that uses the date and the ticker symbol as the major and minor row indices, with Open, High, etc. as columns:
minor                  Open       High        Low      Close  Adj Close
Date
2009-01-02 AAPL   12.268572  19.530001  12.165714  12.964286  11.621618
           GOOG  153.302917        ...        ...        ...        ...
           MSFT   19.530001        ...        ...        ...        ...
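I did convert the Panel object to a DataFrame and tried the pivot_table and set_index methods, but I could not get the ticker symbols as the inner row index. When I use swapaxes on the DataFrame, the dates get swapped into the columns as well. Is there a simple way to get the format I need?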
Option 1
unstack + swaplevel + sort_index
pdata.to_frame().unstack(0).T\
    .swaplevel(0, 1).sort_index(level=[0]).head(6)
minor                  Open        High         Low       Close   Adj Close  \
Date
2009-01-02 AAPL   12.268572   13.005714   12.165714   12.964286   11.621618
           GOOG  153.302917  159.870193  151.762924  159.621811  159.621811
           MSFT   19.530001   20.400000   19.370001   20.330000   16.140903
2009-01-05 AAPL   13.310000   13.740000   13.244286   13.511429   12.112095
           GOOG  159.462845  164.549759  156.482239  162.965073  162.965073
           MSFT   20.200001   20.670000   20.059999   20.520000   16.291746

minor                 Volume
Date
2009-01-02 AAPL  186503800.0
           GOOG    7267900.0
           MSFT   50084000.0
2009-01-05 AAPL  295402100.0
           GOOG    9841400.0
           MSFT   61475200.0
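For anyone who wants to follow the chain in Option 1 without a Panel or a network call, here is a minimal sketch on a toy frame. It assumes that pdata.to_frame() yields rows indexed by (Date, minor) with one column per ticker, as in the question's printout; the name `frame` and the three-field subset are my own choices for illustration, and the numbers are copied from the output above.

```python
import pandas as pd

# Toy stand-in for pdata.to_frame(): rows are (Date, minor), columns are tickers.
index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2009-01-02", "2009-01-05"]), ["Open", "High", "Low"]],
    names=["Date", "minor"],
)
frame = pd.DataFrame(
    [
        [12.268572, 153.302917, 19.530001],  # 2009-01-02 Open
        [13.005714, 159.870193, 20.400000],  # 2009-01-02 High
        [12.165714, 151.762924, 19.370001],  # 2009-01-02 Low
        [13.310000, 159.462845, 20.200001],  # 2009-01-05 Open
        [13.740000, 164.549759, 20.670000],  # 2009-01-05 High
        [13.244286, 156.482239, 20.059999],  # 2009-01-05 Low
    ],
    index=index,
    columns=["AAPL", "GOOG", "MSFT"],
)

step1 = frame.unstack(0)          # Date moves into the columns: (ticker, Date)
step2 = step1.T                   # rows are now (ticker, Date), columns are the fields
step3 = step2.swaplevel(0, 1)     # rows become (Date, ticker)
result = step3.sort_index(level=[0])  # group the rows by date

print(result)
```

Each intermediate name (step1 to step3) corresponds to one method in the one-liner, so printing them in turn shows where the dates and tickers sit after every step.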
Option 2
Wen's neat stack equivalent
pdata.to_frame().stack().unstack(-2).head(6)
minor                  Open        High         Low       Close   Adj Close  \
Date
2009-01-02 AAPL   12.268572   13.005714   12.165714   12.964286   11.621618
           GOOG  153.302917  159.870193  151.762924  159.621811  159.621811
           MSFT   19.530001   20.400000   19.370001   20.330000   16.140903
2009-01-05 AAPL   13.310000   13.740000   13.244286   13.511429   12.112095
           GOOG  159.462845  164.549759  156.482239  162.965073  162.965073
           MSFT   20.200001   20.670000   20.059999   20.520000   16.291746

minor                 Volume
Date
2009-01-02 AAPL  186503800.0
           GOOG    7267900.0
           MSFT   50084000.0
2009-01-05 AAPL  295402100.0
           GOOG    9841400.0
           MSFT   61475200.0
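Since the warning in the question says Panel is going away (to my knowledge it is no longer available in pandas 0.25 and later), it may help to see the same layout built with no Panel at all. The sketch below is an assumption-laden illustration, not part of the original answer: `frames` is a hypothetical dict of per-ticker DataFrames standing in for what the data reader returns, and the values are synthetic. It builds the (Date, ticker) frame directly with pd.concat, then applies the stack/unstack chain from Option 2 to a frame shaped like the to_frame() output, so the two can be compared.

```python
import numpy as np
import pandas as pd

# Hypothetical per-ticker frames; the numbers are synthetic, not real prices.
dates = pd.DatetimeIndex(["2009-01-02", "2009-01-05"], name="Date")
fields = ["Open", "High", "Low", "Close"]
frames = {
    ticker: pd.DataFrame(
        np.arange(8, dtype=float).reshape(2, 4) + offset,
        index=dates,
        columns=fields,
    )
    for ticker, offset in [("AAPL", 10.0), ("GOOG", 100.0), ("MSFT", 20.0)]
}

# Panel-free route: concatenate row-wise keyed by ticker, then put Date first.
direct = (
    pd.concat(frames, names=["ticker", "Date"])
    .swaplevel(0, 1)
    .sort_index()
)

# Frame shaped like the to_frame() output: rows (Date, minor), columns tickers.
wide = pd.concat({t: f.stack() for t, f in frames.items()}, axis=1)
wide.index.names = ["Date", "minor"]

# Option 2 chain: tickers go into the rows, then the fields come out as columns.
via_stack = wide.stack().unstack(-2)

print(direct)
print(via_stack)
```

On recent pandas versions the stack calls may emit a FutureWarning about the new stack implementation; the reshaping shown here does not depend on which implementation is used.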