如何使用添加的属性重塑数据框
How to reshape a dataframe with added attributes
我从 pandas_datareader 中提取的股票代码列表中有一个 Pandas 股票收盘价 DataFrame。
tickers = ['FB', 'AAPL', 'AMZN', 'RSP']
stocks = web.DataReader(tickers,'yahoo',start_date,end_date)
stocks
目前我的 DataFrame 如下所示:
我尝试将日期和符号设置为我的索引,如 `stocks.set_index(['Date','Symbols]) 但我收到此错误:
KeyError: "None of ['Date', 'Symbols'] are in the columns"
如何重塑 DataFrame 以便我的列按以下顺序排列:日期、符号、关闭
从pandas_datareader导入数据
tickers = ['FB', 'AAPL', 'AMZN', 'RSP']
start_date,end_date = "2020-04-01", "2020-04-10"
stocks = data.DataReader(tickers,'yahoo',start_date,end_date)
stocks_close = stocks['Close']
s1 = stocks_close.stack(0) # multi-index pandas dataframe
s2 = s1.reset_index() # no index, flatted and repeated
s1 给:
Date Symbols
2020-04-01 AAPL 240.910004
AMZN 1907.699951
FB 159.600006
RSP 79.879997
2020-04-02 AAPL 244.929993
AMZN 1918.829956
FB 158.190002
RSP 81.139999
2020-04-03 AAPL 241.410004
AMZN 1906.589966
FB 154.179993
RSP 79.830002
2020-04-06 AAPL 262.470001
AMZN 1997.589966
FB 165.550003
RSP 85.870003
2020-04-07 AAPL 259.429993
AMZN 2011.599976
FB 168.830002
RSP 86.559998
2020-04-08 AAPL 266.070007
AMZN 2043.000000
FB 174.279999
RSP 90.139999
2020-04-09 AAPL 267.989990
AMZN 2042.760010
FB 175.190002
RSP 92.220001
s2 :
Date Symbols 0
0 2020-04-01 AAPL 240.910004
1 2020-04-01 AMZN 1907.699951
2 2020-04-01 FB 159.600006
3 2020-04-01 RSP 79.879997
4 2020-04-02 AAPL 244.929993
5 2020-04-02 AMZN 1918.829956
6 2020-04-02 FB 158.190002
7 2020-04-02 RSP 81.139999
8 2020-04-03 AAPL 241.410004
9 2020-04-03 AMZN 1906.589966
10 2020-04-03 FB 154.179993
11 2020-04-03 RSP 79.830002
12 2020-04-06 AAPL 262.470001
13 2020-04-06 AMZN 1997.589966
14 2020-04-06 FB 165.550003
15 2020-04-06 RSP 85.870003
16 2020-04-07 AAPL 259.429993
17 2020-04-07 AMZN 2011.599976
18 2020-04-07 FB 168.830002
19 2020-04-07 RSP 86.559998
20 2020-04-08 AAPL 266.070007
21 2020-04-08 AMZN 2043.000000
22 2020-04-08 FB 174.279999
23 2020-04-08 RSP 90.139999
24 2020-04-09 AAPL 267.989990
25 2020-04-09 AMZN 2042.760010
26 2020-04-09 FB 175.190002
27 2020-04-09 RSP 92.220001
如果你想重命名 s2 的第三列:
s2.columns = ['Dates', 'Symbols', 'Close']
我从 pandas_datareader 中提取的股票代码列表中有一个 Pandas 股票收盘价 DataFrame。
tickers = ['FB', 'AAPL', 'AMZN', 'RSP']
stocks = web.DataReader(tickers,'yahoo',start_date,end_date)
stocks
目前我的 DataFrame 如下所示:
我尝试将日期和符号设置为我的索引,如 `stocks.set_index(['Date','Symbols]) 但我收到此错误:
KeyError: "None of ['Date', 'Symbols'] are in the columns"
如何重塑 DataFrame 以便我的列按以下顺序排列:日期、符号、关闭
从pandas_datareader导入数据
tickers = ['FB', 'AAPL', 'AMZN', 'RSP']
start_date,end_date = "2020-04-01", "2020-04-10"
stocks = data.DataReader(tickers,'yahoo',start_date,end_date)
stocks_close = stocks['Close']
s1 = stocks_close.stack(0) # multi-index pandas dataframe
s2 = s1.reset_index() # no index, flatted and repeated
s1 给:
Date Symbols
2020-04-01 AAPL 240.910004
AMZN 1907.699951
FB 159.600006
RSP 79.879997
2020-04-02 AAPL 244.929993
AMZN 1918.829956
FB 158.190002
RSP 81.139999
2020-04-03 AAPL 241.410004
AMZN 1906.589966
FB 154.179993
RSP 79.830002
2020-04-06 AAPL 262.470001
AMZN 1997.589966
FB 165.550003
RSP 85.870003
2020-04-07 AAPL 259.429993
AMZN 2011.599976
FB 168.830002
RSP 86.559998
2020-04-08 AAPL 266.070007
AMZN 2043.000000
FB 174.279999
RSP 90.139999
2020-04-09 AAPL 267.989990
AMZN 2042.760010
FB 175.190002
RSP 92.220001
s2 :
Date Symbols 0
0 2020-04-01 AAPL 240.910004
1 2020-04-01 AMZN 1907.699951
2 2020-04-01 FB 159.600006
3 2020-04-01 RSP 79.879997
4 2020-04-02 AAPL 244.929993
5 2020-04-02 AMZN 1918.829956
6 2020-04-02 FB 158.190002
7 2020-04-02 RSP 81.139999
8 2020-04-03 AAPL 241.410004
9 2020-04-03 AMZN 1906.589966
10 2020-04-03 FB 154.179993
11 2020-04-03 RSP 79.830002
12 2020-04-06 AAPL 262.470001
13 2020-04-06 AMZN 1997.589966
14 2020-04-06 FB 165.550003
15 2020-04-06 RSP 85.870003
16 2020-04-07 AAPL 259.429993
17 2020-04-07 AMZN 2011.599976
18 2020-04-07 FB 168.830002
19 2020-04-07 RSP 86.559998
20 2020-04-08 AAPL 266.070007
21 2020-04-08 AMZN 2043.000000
22 2020-04-08 FB 174.279999
23 2020-04-08 RSP 90.139999
24 2020-04-09 AAPL 267.989990
25 2020-04-09 AMZN 2042.760010
26 2020-04-09 FB 175.190002
27 2020-04-09 RSP 92.220001
如果你想重命名 s2 的第三列:
s2.columns = ['Dates', 'Symbols', 'Close']