Concat multiindex pandas into single one
I have three pandas MultiIndex DataFrames, each the result of a groupby(['location','date']):
print(a)
location  date        hosp
976       2020-10-02     9
          2020-10-03    10
          2020-10-04    10
print(b)
                      incid_hosp
location  date
976       2020-10-02           1
          2020-10-03           1
          2020-10-04           0
print(c)
                      P   T
location  date
978       2020-10-02  5  60
          2020-10-02  4  52
          2020-10-03  4   2
I would like to concatenate them to get:
print(result)
                      hosp  incid_hosp    P    T
location  date
976       2020-10-02     9           1  NaN  NaN
          2020-10-03    10           1  NaN  NaN
          2020-10-04    10           0  NaN  NaN
978       2020-10-02   NaN         NaN    5   60
          2020-10-03   NaN         NaN    4   52
          2020-10-04   NaN         NaN    4    2
I tried:
result = pd.concat([a,b,c], axis=1, sort=False)
but it produces a lot of NaN values...
Try using combine_first with reduce:
from functools import reduce
reduce(lambda x, y: x.combine_first(y), [a,b,c])
Output:
                        P     T  hosp  incid_hosp
location  date
976       2020-10-02  NaN   NaN   9.0         1.0
          2020-10-03  NaN   NaN  10.0         1.0
          2020-10-04  NaN   NaN  10.0         0.0
978       2020-10-02  5.0  60.0   NaN         NaN
          2020-10-02  4.0  52.0   NaN         NaN
          2020-10-03  4.0   2.0   NaN         NaN
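As a rough sketch of the combine_first approach with the column order from the question restored: the frames below are rebuilt by hand, and the dates in c are assumed to be 2020-10-02 through 2020-10-04 (as in the desired result), not the repeated 2020-10-02 shown in print(c). The names `dates`, `idx_976`, `idx_978`, `frames` and `wanted` are illustrative only.

```python
from functools import reduce

import pandas as pd

# Hand-built stand-ins for the three grouped frames (assumed data).
dates = pd.to_datetime(["2020-10-02", "2020-10-03", "2020-10-04"])
idx_976 = pd.MultiIndex.from_product([[976], dates], names=["location", "date"])
idx_978 = pd.MultiIndex.from_product([[978], dates], names=["location", "date"])

a = pd.DataFrame({"hosp": [9, 10, 10]}, index=idx_976)
b = pd.DataFrame({"incid_hosp": [1, 1, 0]}, index=idx_976)
c = pd.DataFrame({"P": [5, 4, 4], "T": [60, 52, 2]}, index=idx_978)

frames = [a, b, c]

# Fold the frames together; each step fills gaps in x with values from y.
combined = reduce(lambda x, y: x.combine_first(y), frames)

# Select the columns explicitly in the order the frames were passed.
wanted = [col for frame in frames for col in frame.columns]
combined = combined[wanted]

print(combined)
```

Selecting the columns explicitly at the end avoids relying on whatever order combine_first happens to return.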
For three DataFrames, you can chain join:
a.join(b,how='outer').join(c, how='outer')
Output:
                      hosp  incid_hosp    P     T
location  date
976       2020-10-02   9.0         1.0  NaN   NaN
          2020-10-03  10.0         1.0  NaN   NaN
          2020-10-04  10.0         0.0  NaN   NaN
978       2020-10-02   NaN         NaN  5.0  60.0
          2020-10-02   NaN         NaN  4.0  52.0
          2020-10-03   NaN         NaN  4.0   2.0
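For anyone who wants to reproduce the chained join end to end, here is a minimal self-contained sketch. The frames are rebuilt by hand, and the dates in c are assumed to be distinct (2020-10-02 through 2020-10-04, as in the desired result) instead of the repeated 2020-10-02 printed in the question. The helper names `dates`, `idx_976` and `idx_978` are made up for the example.

```python
import pandas as pd

# Hand-built stand-ins for the three grouped frames (assumed data).
dates = pd.to_datetime(["2020-10-02", "2020-10-03", "2020-10-04"])
idx_976 = pd.MultiIndex.from_product([[976], dates], names=["location", "date"])
idx_978 = pd.MultiIndex.from_product([[978], dates], names=["location", "date"])

a = pd.DataFrame({"hosp": [9, 10, 10]}, index=idx_976)
b = pd.DataFrame({"incid_hosp": [1, 1, 0]}, index=idx_976)
c = pd.DataFrame({"P": [5, 4, 4], "T": [60, 52, 2]}, index=idx_978)

# Outer joins keep every (location, date) pair found in any of the frames.
result = a.join(b, how="outer").join(c, how="outer")

print(result)
```

The NaN values that remain are expected: location 976 has no P/T rows and location 978 has no hosp/incid_hosp rows, so those cells have nothing to be filled with. This is also why the integer columns are shown as floats in the output.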