Merging 2 dataframes using similar columns

I have 2 dataframes, as shown below:

df1

 Type       Breed     Common Color  Other Color  Behaviour
 Golden      Big           Gold          White        Fun      
 Corgi      Small          Brown         White       Crazy
 Bulldog    Medium         Black         Grey        Strong

df2

 Type              Breed    Behaviour   Bark Sound
 Pug               Small      Sleepy          Ak
 German Shepard    Big        Cool            Woof
 Puddle            Small      Aggressive      Ek

I want to merge the 2 dataframes on the Type, Breed, and Behaviour columns.

So my desired output is:

Type           Breed      Behavior
Golden          Big         Fun
Corgi           Small       Crazy  
Bulldog         Medium      Strong
Pug             Small       Sleepy
German Shepard  Big         Cool
Puddle          Small       Aggressive

You need concat:

print (pd.concat([df1[['Type','Breed','Behaviour']], 
                  df2[['Type','Breed','Behaviour']]], ignore_index=True))

             Type   Breed   Behaviour
0          Golden     Big         Fun
1           Corgi   Small       Crazy
2         Bulldog  Medium      Strong
3             Pug   Small      Sleepy
4  German Shepard     Big        Cool
5          Puddle   Small  Aggressive
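
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The two frames are typed in by hand from the tables in the question, and the names `shared` and `stacked` are only illustrative, not part of the original answer.

```python
import pandas as pd

# Sample data copied by hand from the tables in the question.
df1 = pd.DataFrame({
    "Type": ["Golden", "Corgi", "Bulldog"],
    "Breed": ["Big", "Small", "Medium"],
    "Common Color": ["Gold", "Brown", "Black"],
    "Other Color": ["White", "White", "Grey"],
    "Behaviour": ["Fun", "Crazy", "Strong"],
})
df2 = pd.DataFrame({
    "Type": ["Pug", "German Shepard", "Puddle"],
    "Breed": ["Small", "Big", "Small"],
    "Behaviour": ["Sleepy", "Cool", "Aggressive"],
    "Bark Sound": ["Ak", "Woof", "Ek"],
})

# Keep only the three shared columns from each frame, then stack the rows.
# ignore_index=True renumbers the rows 0..5 instead of repeating 0..2.
shared = ["Type", "Breed", "Behaviour"]
stacked = pd.concat([df1[shared], df2[shared]], ignore_index=True)
print(stacked)
```

Note that this stacks rows rather than matching them on key values, which is why concat fits here and merge does not.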

A more general approach is to use intersection on the columns of both DataFrames:
cols = df1.columns.intersection(df2.columns)
print (cols)
Index(['Type', 'Breed', 'Behaviour'], dtype='object')

print (pd.concat([df1[cols], df2[cols]], ignore_index=True))
             Type   Breed   Behaviour
0          Golden     Big         Fun
1           Corgi   Small       Crazy
2         Bulldog  Medium      Strong
3             Pug   Small      Sleepy
4  German Shepard     Big        Cool
5          Puddle   Small  Aggressive
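
As a possible alternative to computing the intersection by hand, `pd.concat` accepts `join='inner'`, which keeps only the columns present in every input frame. The sketch below rebuilds the question's data by hand and checks that both routes give the same rows; the variable names `via_intersection` and `via_inner` are only illustrative.

```python
import pandas as pd

# Sample data copied by hand from the tables in the question.
df1 = pd.DataFrame({
    "Type": ["Golden", "Corgi", "Bulldog"],
    "Breed": ["Big", "Small", "Medium"],
    "Common Color": ["Gold", "Brown", "Black"],
    "Other Color": ["White", "White", "Grey"],
    "Behaviour": ["Fun", "Crazy", "Strong"],
})
df2 = pd.DataFrame({
    "Type": ["Pug", "German Shepard", "Puddle"],
    "Breed": ["Small", "Big", "Small"],
    "Behaviour": ["Sleepy", "Cool", "Aggressive"],
    "Bark Sound": ["Ak", "Woof", "Ek"],
})

# Route 1: explicit intersection of the column labels.
cols = df1.columns.intersection(df2.columns)
via_intersection = pd.concat([df1[cols], df2[cols]], ignore_index=True)

# Route 2: let concat drop the non-shared columns itself.
via_inner = pd.concat([df1, df2], join="inner", ignore_index=True)

# Align the column order before comparing, in case the two routes differ there.
print(via_inner[list(cols)].equals(via_intersection))
```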

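Even more generally, if df1 and df2 themselves contain no NaN values, concatenate the full frames and use dropna to remove the columns that contain NaN: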

print (pd.concat([df1 ,df2], ignore_index=True))
     Bark Sound   Behaviour   Breed Common Color Other Color            Type
0        NaN         Fun     Big         Gold       White          Golden
1        NaN       Crazy   Small        Brown       White           Corgi
2        NaN      Strong  Medium        Black        Grey         Bulldog
3         Ak      Sleepy   Small          NaN         NaN             Pug
4       Woof        Cool     Big          NaN         NaN  German Shepard
5         Ek  Aggressive   Small          NaN         NaN          Puddle               


print (pd.concat([df1 ,df2], ignore_index=True).dropna(axis=1))
    Behaviour   Breed            Type
0         Fun     Big          Golden
1       Crazy   Small           Corgi
2      Strong  Medium         Bulldog
3      Sleepy   Small             Pug
4        Cool     Big  German Shepard
5  Aggressive   Small          Puddle
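
The "no NaN values" condition matters because dropna cannot tell a NaN created by concat from one that was already in the data. The sketch below uses hypothetical frames `left` and `right` (a cut-down version of the question's data, with one Behaviour value deliberately set to missing) to show a shared column being dropped by mistake.

```python
import pandas as pd

# Hypothetical data: same shape as the question's, but with one genuine
# missing value in a shared column (Behaviour of the last row).
left = pd.DataFrame({
    "Type": ["Golden", "Corgi"],
    "Breed": ["Big", "Small"],
    "Behaviour": ["Fun", "Crazy"],
    "Common Color": ["Gold", "Brown"],
})
right = pd.DataFrame({
    "Type": ["Pug", "Puddle"],
    "Breed": ["Small", "Small"],
    "Behaviour": ["Sleepy", None],
    "Bark Sound": ["Ak", "Ek"],
})

# dropna(axis=1) removes every column holding any missing value, so the
# shared Behaviour column is lost along with the non-shared ones.
dropped = pd.concat([left, right], ignore_index=True).dropna(axis=1)
print(list(dropped.columns))

# Selecting the shared columns explicitly keeps Behaviour, missing value and all.
cols = left.columns.intersection(right.columns)
kept = pd.concat([left[cols], right[cols]], ignore_index=True)
print(list(kept.columns))
```

If the input may contain missing values, the intersection approach is the safer of the two.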

Using join, dropping the columns that do not overlap:

df1.T.join(df2.T, lsuffix='_').dropna().T.reset_index(drop=True)