Concat columns produces NaN even though the axis is the same for all datasets
I am trying to concatenate columns from multiple DataFrames.
`AUD = OHLC_AUDUSD['bid']['close'];`
`AUD = AUD.dropna()`
`CAD = OHLC_USDCAD['bid']['close'];`
`CAD = CAD.dropna()`
`print AUD`
symbol  timestamp
AUDUSD  2015-01-05    0.8096
        2015-01-06    0.8077
        2015-01-07    0.8074
        2015-01-08    0.8112
        2015-01-09    0.8200
Name: close, dtype: float64
`print CAD`
symbol  timestamp
USDCAD  2015-01-05    1.1756
        2015-01-06    1.1838
        2015-01-07    1.1818
        2015-01-08    1.1826
        2015-01-09    1.1864
Name: close, dtype: float64
`key=['AUD','CAD']`
`marketData = pd.concat([AUD,CAD], axis=1, keys=key)`
                      AUD     CAD
symbol timestamp
AUDUSD 2015-01-05  0.8096     NaN
       2015-01-06  0.8077     NaN
       2015-01-07  0.8074     NaN
       2015-01-08  0.8112     NaN
       2015-01-09  0.8200     NaN
USDCAD 2015-01-05     NaN  1.1756
       2015-01-06     NaN  1.1838
       2015-01-07     NaN  1.1818
       2015-01-08     NaN  1.1826
       2015-01-09     NaN  1.1864
What I would like to see is:
               AUD     CAD
timestamp
2015-01-05  0.8096  1.1756
2015-01-06  0.8077  1.1838
2015-01-07  0.8074  1.1818
2015-01-08  0.8112  1.1826
2015-01-09  0.8200  1.1864
I have not been able to figure out why this happens!?
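The problem is that you are using a `MultiIndex`. The first corresponding elements therefore do not both have the index `2015-01-05`: one has the index `(AUDUSD, 2015-01-05)` and the other has the index `(USDCAD, 2015-01-05)`. These are considered different index values, so `concat` cannot align them and fills the gaps with NaN. You need to use `droplevel` to remove the `symbol` part of the index, leaving only `timestamp`.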
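A minimal sketch of the `droplevel` approach, assuming Python 3 and a pandas version that provides `Series.droplevel`. The `make_close` helper and the hand-built sample data are hypothetical stand-ins for `OHLC_AUDUSD['bid']['close']` and `OHLC_USDCAD['bid']['close']`, which are not shown in the question.

```python
import pandas as pd

# Sample dates taken from the question's output.
dates = pd.to_datetime(
    ["2015-01-05", "2015-01-06", "2015-01-07", "2015-01-08", "2015-01-09"]
)


def make_close(symbol, values):
    """Hypothetical helper: build a close-price Series indexed by (symbol, timestamp)."""
    index = pd.MultiIndex.from_product(
        [[symbol], dates], names=["symbol", "timestamp"]
    )
    return pd.Series(values, index=index, name="close")


AUD = make_close("AUDUSD", [0.8096, 0.8077, 0.8074, 0.8112, 0.8200])
CAD = make_close("USDCAD", [1.1756, 1.1838, 1.1818, 1.1826, 1.1864])

# Remove the 'symbol' level so both Series are indexed by timestamp only.
AUD_ts = AUD.droplevel("symbol")
CAD_ts = CAD.droplevel("symbol")

# The indexes now match, so concat aligns the rows instead of inserting NaN.
marketData = pd.concat([AUD_ts, CAD_ts], axis=1, keys=["AUD", "CAD"])
print(marketData)
```

If `Series.droplevel` is not available in your pandas version, assigning `AUD.index = AUD.index.droplevel('symbol')` should have the same effect.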
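Alternatively, you can `unstack` the `symbol` level of each input's index before the `concat`, so that the `symbol` becomes the column name. Then, after the `concat`, you can overwrite the column names with `['AUD', 'CAD']`.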
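The `unstack` alternative might look like the sketch below. As before, the `make_close` helper and the sample data are hypothetical and only mimic the `(symbol, timestamp)` index shown in the question.

```python
import pandas as pd

# Sample dates taken from the question's output.
dates = pd.to_datetime(
    ["2015-01-05", "2015-01-06", "2015-01-07", "2015-01-08", "2015-01-09"]
)


def make_close(symbol, values):
    """Hypothetical helper: build a close-price Series indexed by (symbol, timestamp)."""
    index = pd.MultiIndex.from_product(
        [[symbol], dates], names=["symbol", "timestamp"]
    )
    return pd.Series(values, index=index, name="close")


AUD = make_close("AUDUSD", [0.8096, 0.8077, 0.8074, 0.8112, 0.8200])
CAD = make_close("USDCAD", [1.1756, 1.1838, 1.1818, 1.1826, 1.1864])

# Move the 'symbol' level from the index into the columns.
# Each result is a one-column DataFrame indexed by timestamp.
AUD_wide = AUD.unstack("symbol")
CAD_wide = CAD.unstack("symbol")

# Concatenate side by side; the columns are the symbols at this point.
marketData = pd.concat([AUD_wide, CAD_wide], axis=1)

# Overwrite the symbol column names with the shorter labels.
marketData.columns = ["AUD", "CAD"]
print(marketData)
```

This variant keeps the symbol visible as a column name until the final rename, which can be handy when concatenating many pairs at once.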