Concatenate several stock price dataframes
I am using pandas_datareader to pull monthly price data from Yahoo, like this:
import pandas_datareader.data as web
fb = web.get_data_yahoo('FB', '06/01/2012', interval='m')
amzn = web.get_data_yahoo('AMZN', '06/01/2012', interval='m')
nflx = web.get_data_yahoo('NFLX', '06/01/2012', interval='m')
goog = web.get_data_yahoo('GOOG', '06/01/2012', interval='m')
Then I clean it up to keep only the adjusted closing prices, like this:
import pandas as pd
amzn = amzn.rename(columns={'Adj Close': 'AMZN'})
amzn = pd.DataFrame(amzn['AMZN'], columns=['AMZN'])
I repeat the cleaning for all four dataframes. Once that is done, I want to combine the four dataframes into one. To do so, I use:
data = pd.concat([fb, amzn, nflx, goog])
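However, this produces a dataframe in which three of the four columns are NaN. I have confirmed that the dates match. Why does this happen? Any insight is appreciated.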
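A likely explanation, sketched here with placeholder numbers instead of a Yahoo download: `pd.concat` stacks frames vertically by default (`axis=0`). Each cleaned frame has a different column name, so the result carries the union of the four columns, and every row only has a value in the column it came from. Passing `axis=1` places the frames side by side and aligns them on the Date index. The prices below are arbitrary and only illustrate the shapes.

```python
import pandas as pd

# Placeholder prices on a shared monthly index; the numbers are arbitrary.
dates = pd.to_datetime(['2012-06-01', '2012-07-02', '2012-08-01'])
fb = pd.DataFrame({'FB': [1.0, 2.0, 3.0]}, index=dates)
amzn = pd.DataFrame({'AMZN': [4.0, 5.0, 6.0]}, index=dates)
nflx = pd.DataFrame({'NFLX': [7.0, 8.0, 9.0]}, index=dates)
goog = pd.DataFrame({'GOOG': [10.0, 11.0, 12.0]}, index=dates)

# Default axis=0 stacks the rows: each row is filled in one column only.
stacked = pd.concat([fb, amzn, nflx, goog])

# axis=1 joins the frames column-wise, aligned on the index.
data = pd.concat([fb, amzn, nflx, goog], axis=1)

print(stacked.shape, data.shape)
print(data)
```

With `axis=1`, dates that are missing from one of the frames would still show up as NaN in that frame's column, so the index values need to match exactly.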
There is a better way: use a pandas.Panel:
In [20]: p = web.get_data_yahoo(['FB','AMZN','NFLX','GOOG'], '06/01/2012', interval='m')
In [21]: p.loc['Adj Close']
Out[21]:
AMZN FB GOOG NFLX
Date
2012-06-01 228.350006 31.100000 289.745758 9.784286
2012-07-02 233.300003 21.709999 316.169373 8.121428
2012-08-01 248.270004 18.059999 342.203369 8.531428
2012-09-04 254.320007 21.660000 376.873779 7.777143
2012-10-01 232.889999 21.110001 339.810760 11.320000
2012-11-01 252.050003 28.000000 348.836761 11.672857
2012-12-03 250.869995 26.620001 353.337280 13.227143
2013-01-02 265.500000 30.980000 377.468170 23.605715
2013-02-01 264.269989 27.250000 400.200500 26.868572
2013-03-01 266.489990 25.580000 396.698975 27.040001
2013-04-01 253.809998 27.770000 411.873840 30.867144
2013-05-01 269.200012 24.350000 435.175598 32.321430
2013-06-03 277.690002 24.879999 439.746002 30.155714
2013-07-01 301.220001 36.799999 443.432343 34.925713
2013-08-01 280.980011 41.290001 423.027710 40.558571
2013-09-03 312.640015 50.230000 437.518250 44.172855
2013-10-01 364.029999 50.209999 514.776123 46.068573
2013-11-01 393.619995 47.009998 529.266602 52.257141
2013-12-02 398.790009 54.650002 559.796204 52.595715
2014-01-02 358.690002 62.570000 589.896118 58.475716
2014-02-03 362.100006 68.459999 607.218811 63.661430
2014-03-03 336.369995 60.240002 556.972473 50.290001
2014-04-01 304.130005 59.779999 526.662415 46.005714
2014-05-01 312.549988 63.299999 559.892578 59.689999
2014-06-02 324.779999 67.290001 575.282593 62.942856
... ... ... ... ...
2015-02-02 380.160004 78.970001 558.402527 67.844284
2015-03-02 372.100006 82.220001 548.002441 59.527142
2015-04-01 421.779999 78.769997 537.340027 79.500000
2015-05-01 429.230011 79.190002 532.109985 89.151428
2015-06-01 434.089996 85.769997 520.510010 93.848572
2015-07-01 536.150024 94.010002 625.609985 114.309998
2015-08-03 512.890015 89.430000 618.250000 115.029999
2015-09-01 511.890015 89.900002 608.419983 103.260002
2015-10-01 625.900024 101.970001 710.809998 108.379997
2015-11-02 664.799988 104.239998 742.599976 123.330002
2015-12-01 675.890015 104.660004 758.880005 114.379997
2016-01-04 587.000000 112.209999 742.950012 91.839996
2016-02-01 552.520020 106.919998 697.770020 93.410004
2016-03-01 593.640015 114.099998 744.950012 102.230003
2016-04-01 659.590027 117.580002 693.010010 90.029999
2016-05-02 722.789978 118.809998 735.719971 102.570000
2016-06-01 715.619995 114.279999 692.099976 91.480003
2016-07-01 758.809998 123.940002 768.789978 91.250000
2016-08-01 769.159973 126.120003 767.049988 97.449997
2016-09-01 837.309998 128.270004 777.289978 98.550003
2016-10-03 789.820007 130.990005 784.539978 124.870003
2016-11-01 750.570007 118.419998 758.039978 117.000000
2016-12-01 749.869995 115.050003 771.820007 123.800003
2017-01-03 823.479980 130.320007 796.789978 140.710007
2017-02-01 807.640015 132.059998 801.340027 140.970001
[57 rows x 4 columns]
Panel axes:
In [22]: p.axes
Out[22]:
[Index(['Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close'], dtype='object'),
DatetimeIndex(['2012-06-01', '2012-07-02', '2012-08-01', '2012-09-04', '2012-10-01', '2012-11-01', '2012-12-03', '2013-01-02', '2013-02-01', '2013-03-01', '2013-04-01', '2013-05-01', '2013-06-03',
'2013-07-01', '2013-08-01', '2013-09-03', '2013-10-01', '2013-11-01', '2013-12-02', '2014-01-02', '2014-02-03', '2014-03-03', '2014-04-01', '2014-05-01', '2014-06-02', '2014-07-01',
'2014-08-01', '2014-09-02', '2014-10-01', '2014-11-03', '2014-12-01', '2015-01-02', '2015-02-02', '2015-03-02', '2015-04-01', '2015-05-01', '2015-06-01', '2015-07-01', '2015-08-03',
'2015-09-01', '2015-10-01', '2015-11-02', '2015-12-01', '2016-01-04', '2016-02-01', '2016-03-01', '2016-04-01', '2016-05-02', '2016-06-01', '2016-07-01', '2016-08-01', '2016-09-01',
'2016-10-03', '2016-11-01', '2016-12-01', '2017-01-03', '2017-02-01'],
dtype='datetime64[ns]', name='Date', freq=None),
Index(['AMZN', 'FB', 'GOOG', 'NFLX'], dtype='object')]
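Note that `pandas.Panel` was deprecated in pandas 0.20 and removed in 0.25, so the session above needs an older installation. A common replacement for the same three-axis layout is a DataFrame with two-level columns (field, ticker). The sketch below builds such a frame by hand with placeholder numbers. The level names `Attributes` and `Symbols` are illustrative assumptions, and I have not verified here what a given pandas_datareader release returns for a list of tickers.

```python
import numpy as np
import pandas as pd

# Placeholder data: the numbers are arbitrary, only the layout matters.
dates = pd.to_datetime(['2012-06-01', '2012-07-02', '2012-08-01'])
fields = ['Close', 'Adj Close']
tickers = ['AMZN', 'FB', 'GOOG', 'NFLX']

# Two-level columns: outer level is the price field, inner level the ticker.
columns = pd.MultiIndex.from_product(
    [fields, tickers], names=['Attributes', 'Symbols']  # hypothetical names
)
wide = pd.DataFrame(
    np.arange(24, dtype=float).reshape(3, 8), index=dates, columns=columns
)
wide.index.name = 'Date'

# Selecting the outer key yields one column per ticker,
# the same shape as p.loc['Adj Close'] on a Panel.
adj_close = wide['Adj Close']
print(adj_close)
```

`wide.xs('AMZN', axis=1, level=1)` goes the other way and returns every field for a single ticker.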