Concat Columns produces NaN even though axis is the same for all datasets

I am trying to concatenate columns from multiple DataFrames.

`AUD = OHLC_AUDUSD['bid']['close']`
`AUD = AUD.dropna()`
`CAD = OHLC_USDCAD['bid']['close']`
`CAD = CAD.dropna()`

`print(AUD)`

symbol  timestamp
AUDUSD  2015-01-05    0.8096
        2015-01-06    0.8077
        2015-01-07    0.8074
        2015-01-08    0.8112
        2015-01-09    0.8200
Name: close, dtype: float64

`print(CAD)`

symbol  timestamp
USDCAD  2015-01-05    1.1756
        2015-01-06    1.1838
        2015-01-07    1.1818
        2015-01-08    1.1826
        2015-01-09    1.1864
Name: close, dtype: float64

`key=['AUD','CAD']`
`marketData = pd.concat([AUD,CAD], axis=1, keys=key)`

                      AUD     CAD
symbol timestamp                 
AUDUSD 2015-01-05  0.8096     NaN
       2015-01-06  0.8077     NaN
       2015-01-07  0.8074     NaN
       2015-01-08  0.8112     NaN
       2015-01-09  0.8200     NaN
USDCAD 2015-01-05     NaN  1.1756
       2015-01-06     NaN  1.1838
       2015-01-07     NaN  1.1818
       2015-01-08     NaN  1.1826
       2015-01-09     NaN  1.1864

What I would like to see is:

              AUD     CAD
timestamp                 
2015-01-05  0.8096  1.1756
2015-01-06  0.8077  1.1838
2015-01-07  0.8074  1.1818
2015-01-08  0.8112  1.1826
2015-01-09  0.8200  1.1864

I can't figure this out!?

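The problem is that you are using a MultiIndex. The first elements of the two series therefore do not both have the index 2015-01-05: one has the index (AUDUSD, 2015-01-05) and the other has (USDCAD, 2015-01-05). These are treated as different index labels, so concat cannot align them and fills the gaps with NaN. You need to use droplevel to remove the symbol part of the index, leaving only timestamp.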
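A minimal sketch of the droplevel approach, for illustration only. The `make_close` helper and the hand-built series are assumptions that stand in for `OHLC_AUDUSD['bid']['close']` and `OHLC_USDCAD['bid']['close']`, which are not shown in the question; `Series.droplevel` also assumes a reasonably recent pandas.

```python
import pandas as pd

dates = pd.to_datetime(
    ["2015-01-05", "2015-01-06", "2015-01-07", "2015-01-08", "2015-01-09"]
)

def make_close(symbol, values):
    # Hypothetical helper: builds a Series indexed by (symbol, timestamp),
    # mimicking the shape of the series printed in the question.
    index = pd.MultiIndex.from_product(
        [[symbol], dates], names=["symbol", "timestamp"]
    )
    return pd.Series(values, index=index, name="close")

AUD = make_close("AUDUSD", [0.8096, 0.8077, 0.8074, 0.8112, 0.8200])
CAD = make_close("USDCAD", [1.1756, 1.1838, 1.1818, 1.1826, 1.1864])

# Drop the 'symbol' level so both series are indexed by timestamp only.
AUD_flat = AUD.droplevel("symbol")
CAD_flat = CAD.droplevel("symbol")

# The indexes now match, so concat aligns the rows instead of stacking them.
marketData = pd.concat([AUD_flat, CAD_flat], axis=1, keys=["AUD", "CAD"])
print(marketData)
```

On older pandas versions without `Series.droplevel`, assigning `AUD.index = AUD.index.droplevel('symbol')` should have the same effect.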

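Alternatively, you could unstack the symbol index level of the inputs before the concat, so that the symbol becomes the column name. Then, after the concat, you can overwrite the column names with ['AUD', 'CAD'].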
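The unstack variant might look roughly like the sketch below. As before, `make_close` and the sample values are assumptions used only to reproduce the shape of the data in the question.

```python
import pandas as pd

dates = pd.to_datetime(
    ["2015-01-05", "2015-01-06", "2015-01-07", "2015-01-08", "2015-01-09"]
)

def make_close(symbol, values):
    # Hypothetical helper: builds a Series indexed by (symbol, timestamp).
    index = pd.MultiIndex.from_product(
        [[symbol], dates], names=["symbol", "timestamp"]
    )
    return pd.Series(values, index=index, name="close")

AUD = make_close("AUDUSD", [0.8096, 0.8077, 0.8074, 0.8112, 0.8200])
CAD = make_close("USDCAD", [1.1756, 1.1838, 1.1818, 1.1826, 1.1864])

# Move the 'symbol' level into the columns: each result is a one-column
# DataFrame indexed by timestamp, with the symbol as the column name.
AUD_wide = AUD.unstack("symbol")
CAD_wide = CAD.unstack("symbol")

# Concatenate side by side, then overwrite the symbol column names.
marketData = pd.concat([AUD_wide, CAD_wide], axis=1)
marketData.columns = ["AUD", "CAD"]
print(marketData)
```

Compared with droplevel, this keeps the symbol visible as a column label until the final rename, which can help when checking that the columns are in the expected order.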