Merging two DataFrames with a full outer join does not return the Key column from both DataFrames
I have two DataFrames as shown below. I am using pandas and numpy to compare them for differences.
df_a
            Key      Value
0    data_owner       John
1  locationcode      local
2          Unit      sales
3   application  autosales
df_b
            Key        Value
0    data_owner         John
1  locationcode        local
2          Unit        sales
3   application    autosales
4    department  frontoffice
I am using the following code to merge them:
df = pd.merge(df_a,df_b,on=['Key'],how='outer',left_index=True,right_index=True)
df['diff'] = np.where((df['Value_x']==df['Value_y']), 'No', 'Yes')
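The intended output is a comparison DataFrame in which any item missing from either side is also shown.
Actual output below. The problem is that I want to show the keys from both DataFrames, but as you can see, the output shows the key only once. In other words, I need Key_y to be part of the output as well.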
            Key      Value_x    Value_y diff
0    data_owner         John       John   No
1  locationcode        local      local   No
2          unit        sales      sales   No
3   application    autosales  autosales   No
4    department  frontoffice        NaN   No
Expected output: I want to show the keys from both DataFrames.
          Key_x      Value_x         Key_y    Value_y diff
0    data_owner         John    data_owner       John   No
1  locationcode        local  locationcode      local   No
2          unit        sales          unit      sales   No
3   application    autosales   application  autosales   No
4    department  frontoffice           NaN        NaN  Yes
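Use DataFrame.add_suffix.
Add a suffix to the columns of both DataFrames before merging, so that their key columns are not combined into a single column by the merge: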
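import numpy as np
import pandas as pd

# Suffix every column first so Key_x and Key_y stay separate after the merge
df = pd.merge(
    df_b.add_suffix('_x'), df_a.add_suffix('_y'),
    left_on='Key_x', right_on='Key_y', how='outer')
# NaN never compares equal, so rows missing on one side are flagged 'Yes'
df['diff'] = np.where(df['Value_x'].eq(df['Value_y']), 'No', 'Yes')
print(df)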
          Key_x      Value_x         Key_y    Value_y diff
0    data_owner         John    data_owner       John   No
1  locationcode        local  locationcode      local   No
2          Unit        sales          Unit      sales   No
3   application    autosales   application  autosales   No
4    department  frontoffice           NaN        NaN  Yes
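For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The sample data is rebuilt from the tables in the question, and `compare_frames` is a hypothetical helper name introduced only for illustration. Row order of an outer merge may differ between pandas versions, so the sketch does not rely on it.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the tables in the question.
df_a = pd.DataFrame({
    "Key": ["data_owner", "locationcode", "Unit", "application"],
    "Value": ["John", "local", "sales", "autosales"],
})
df_b = pd.DataFrame({
    "Key": ["data_owner", "locationcode", "Unit", "application", "department"],
    "Value": ["John", "local", "sales", "autosales", "frontoffice"],
})


def compare_frames(left, right):
    """Full outer join that keeps both key columns (hypothetical helper)."""
    # Suffixing before the merge gives the two key columns different names,
    # so merge cannot collapse them into a single 'Key' column.
    merged = pd.merge(
        left.add_suffix("_x"), right.add_suffix("_y"),
        left_on="Key_x", right_on="Key_y", how="outer",
    )
    # A missing value never compares equal, so one-sided rows get 'Yes'.
    merged["diff"] = np.where(
        merged["Value_x"].eq(merged["Value_y"]), "No", "Yes"
    )
    return merged


# df_b goes on the left so that its extra 'department' row lands in Key_x.
result = compare_frames(df_b, df_a)
print(result)
```

Swapping the arguments (`compare_frames(df_a, df_b)`) would put the `department` row in `Key_y`/`Value_y` instead, with `Key_x`/`Value_x` missing.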