Merging 2 dataframes using similar columns
I have 2 dataframes, listed as follows:
df
Type     Breed   Common Color  Other Color  Behaviour
Golden   Big     Gold          White        Fun
Corgi    Small   Brown         White        Crazy
Bulldog  Medium  Black         Grey         Strong
df2
Type            Breed  Behaviour   Bark Sound
Pug             Small  Sleepy      Ak
German Shepard  Big    Cool        Woof
Puddle          Small  Aggressive  Ek
I want to merge the 2 dataframes on the Type, Breed and Behaviour columns.
So my desired output would be:
Type            Breed   Behaviour
Golden          Big     Fun
Corgi           Small   Crazy
Bulldog         Medium  Strong
Pug             Small   Sleepy
German Shepard  Big     Cool
Puddle          Small   Aggressive
You need concat:
print(pd.concat([df1[['Type', 'Breed', 'Behaviour']],
                 df2[['Type', 'Breed', 'Behaviour']]], ignore_index=True))
             Type   Breed   Behaviour
0          Golden     Big         Fun
1           Corgi   Small       Crazy
2         Bulldog  Medium      Strong
3             Pug   Small      Sleepy
4  German Shepard     Big        Cool
5          Puddle   Small  Aggressive
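For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the two frames from the question and applies the same concat call. The way the frames are constructed is an assumption on my part; only the column names and values come from the question.

```python
import pandas as pd

# Sample frames rebuilt from the tables in the question (construction is assumed).
df1 = pd.DataFrame({
    'Type': ['Golden', 'Corgi', 'Bulldog'],
    'Breed': ['Big', 'Small', 'Medium'],
    'Common Color': ['Gold', 'Brown', 'Black'],
    'Other Color': ['White', 'White', 'Grey'],
    'Behaviour': ['Fun', 'Crazy', 'Strong'],
})
df2 = pd.DataFrame({
    'Type': ['Pug', 'German Shepard', 'Puddle'],
    'Breed': ['Small', 'Big', 'Small'],
    'Behaviour': ['Sleepy', 'Cool', 'Aggressive'],
    'Bark Sound': ['Ak', 'Woof', 'Ek'],
})

shared = ['Type', 'Breed', 'Behaviour']

# Select the shared columns from each frame, then stack the rows.
# ignore_index=True renumbers the rows 0..5 instead of repeating 0..2.
stacked = pd.concat([df1[shared], df2[shared]], ignore_index=True)
print(stacked)
```

Note that this stacks rows rather than matching them on key values, which is why concat fits here and merge does not.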
A more general approach is to use intersection on the columns of both DataFrames:
cols = df1.columns.intersection(df2.columns)
print(cols)
Index(['Type', 'Breed', 'Behaviour'], dtype='object')
print(pd.concat([df1[cols], df2[cols]], ignore_index=True))
             Type   Breed   Behaviour
0          Golden     Big         Fun
1           Corgi   Small       Crazy
2         Bulldog  Medium      Strong
3             Pug   Small      Sleepy
4  German Shepard     Big        Cool
5          Puddle   Small  Aggressive
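If there are more than two frames, the same idea can be folded over all of them. The sketch below is only an illustration: `concat_shared_columns` is a hypothetical helper name, not a pandas function, and `df3` with its values is made up for the example.

```python
from functools import reduce

import pandas as pd


def concat_shared_columns(frames):
    """Stack frames row-wise, keeping only columns present in every frame.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Fold Index.intersection over the column indexes of all frames.
    cols = reduce(lambda acc, idx: acc.intersection(idx),
                  (f.columns for f in frames))
    return pd.concat([f[cols] for f in frames], ignore_index=True)


# Small made-up frames; df3 is an assumption added for the example.
df1 = pd.DataFrame({'Type': ['Golden'], 'Breed': ['Big'],
                    'Common Color': ['Gold'], 'Behaviour': ['Fun']})
df2 = pd.DataFrame({'Type': ['Pug'], 'Breed': ['Small'],
                    'Behaviour': ['Sleepy'], 'Bark Sound': ['Ak']})
df3 = pd.DataFrame({'Type': ['Husky'], 'Behaviour': ['Loud'], 'Weight': [25]})

result = concat_shared_columns([df1, df2, df3])
print(result)
```

Only Type and Behaviour appear in all three frames, so those are the columns that survive.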
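More generally still, if df1 and df2 contain no NaN values of their own, concatenate everything and use dropna to remove the columns that contain NaN: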
print(pd.concat([df1, df2], ignore_index=True))
  Bark Sound   Behaviour   Breed Common Color Other Color            Type
0        NaN         Fun     Big         Gold       White          Golden
1        NaN       Crazy   Small        Brown       White           Corgi
2        NaN      Strong  Medium        Black        Grey         Bulldog
3         Ak      Sleepy   Small          NaN         NaN             Pug
4       Woof        Cool     Big          NaN         NaN  German Shepard
5         Ek  Aggressive   Small          NaN         NaN          Puddle
print(pd.concat([df1, df2], ignore_index=True).dropna(axis=1))
    Behaviour   Breed            Type
0         Fun     Big          Golden
1       Crazy   Small           Corgi
2      Strong  Medium         Bulldog
3      Sleepy   Small             Pug
4        Cool     Big  German Shepard
5  Aggressive   Small          Puddle
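The "no NaN values" condition matters more than it might look. Here is a small sketch of what happens when it does not hold, assuming a made-up missing value in the shared Behaviour column: dropna removes that column too, while the intersection approach keeps it.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Type': ['Golden', 'Corgi'],
    'Breed': ['Big', 'Small'],
    'Common Color': ['Gold', 'Brown'],
    'Behaviour': ['Fun', None],  # hypothetical missing value in a shared column
})
df2 = pd.DataFrame({
    'Type': ['Pug'],
    'Breed': ['Small'],
    'Behaviour': ['Sleepy'],
    'Bark Sound': ['Ak'],
})

both = pd.concat([df1, df2], ignore_index=True)

# dropna(axis=1) drops every column containing any NaN, so the shared
# 'Behaviour' column is lost along with the non-shared ones.
via_dropna = both.dropna(axis=1)

# Selecting the column intersection keeps 'Behaviour' and its missing value.
cols = df1.columns.intersection(df2.columns)
via_intersection = pd.concat([df1[cols], df2[cols]], ignore_index=True)

print(list(via_dropna.columns))
print(list(via_intersection.columns))
```

So if the real data can have gaps, the intersection version is probably the safer choice.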
Use join to drop the non-overlapping columns:
df1.T.join(df2.T, lsuffix='_').dropna().T.reset_index(drop=True)
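The transpose trick works because join aligns on the index, and after .T the index holds the column names. Columns that exist only in df1 come back with NaN from df2, and dropna removes them. As an alternative that is not in the answers here, pd.concat also accepts join='inner', which keeps only the columns common to all inputs. The sketch below rebuilds the question's data (the construction is assumed) and runs both.

```python
import pandas as pd

# Frames rebuilt from the question's tables (construction is assumed).
df1 = pd.DataFrame({
    'Type': ['Golden', 'Corgi', 'Bulldog'],
    'Breed': ['Big', 'Small', 'Medium'],
    'Common Color': ['Gold', 'Brown', 'Black'],
    'Other Color': ['White', 'White', 'Grey'],
    'Behaviour': ['Fun', 'Crazy', 'Strong'],
})
df2 = pd.DataFrame({
    'Type': ['Pug', 'German Shepard', 'Puddle'],
    'Breed': ['Small', 'Big', 'Small'],
    'Behaviour': ['Sleepy', 'Cool', 'Aggressive'],
    'Bark Sound': ['Ak', 'Woof', 'Ek'],
})

# Transposed join: column names become the join key, and lsuffix
# disambiguates the overlapping 0, 1, 2 row labels from df1.
via_join = df1.T.join(df2.T, lsuffix='_').dropna().T.reset_index(drop=True)

# Alternative: let concat intersect the columns itself.
via_inner = pd.concat([df1, df2], join='inner', ignore_index=True)

print(via_join)
print(via_inner)

# Compare cell values with the columns put in the same order.
same = (via_inner[list(via_join.columns)].values.tolist()
        == via_join.values.tolist())
print(same)
```

One thing to keep in mind: transposing a frame with mixed column types turns everything into object dtype, so the concat route is likely kinder to numeric columns.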