Outer Join Tables - Keep Descriptions

import pandas as pd

new = pd.DataFrame({'table': ['a', 'b', 'c', 'd'], 'desc': ['', '', '', ''], 'total': [22, 22, 22, 22]})
old = pd.DataFrame({'table': ['a', 'b', 'e'], 'desc': ['foo', 'foo', 'foo'], 'total': [11, 11, 11]})

merged = pd.merge(new, old, how='outer', on=['table', 'total'])

Output:

  table desc_x  total desc_y
0     a            22    NaN
1     b            22    NaN
2     c            22    NaN
3     d            22    NaN
4     a    NaN     11    foo

Desired output:

  table desc  total
0     a   foo     22
1     b   foo     22
2     c           22
3     d           22
4     a   foo     11

I tried an outer join, but it drops the descriptions for a and b.

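  • Given what you are trying to achieve, an outer join on both table and total makes no sense: no (table, total) pair appears in both frames, so no rows are ever matched.
  • Change it to an outer join on table alone.
  • You can then pick the values implied by your desired output (desc from old where new has none, total from new where it exists) and clean up the suffixed columns.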
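
To see why the two-column join matches nothing, here is a minimal sketch that uses the `indicator=True` option of `pd.merge` to count matched rows. The variable names `both_keys`, `table_only`, `matched_both_keys` and `matched_table_only` are only illustrative and do not come from the question.

```python
import pandas as pd

new = pd.DataFrame({'table': ['a', 'b', 'c', 'd'], 'desc': ['', '', '', ''], 'total': [22, 22, 22, 22]})
old = pd.DataFrame({'table': ['a', 'b', 'e'], 'desc': ['foo', 'foo', 'foo'], 'total': [11, 11, 11]})

# indicator=True adds a "_merge" column: 'left_only', 'right_only' or 'both'
both_keys = pd.merge(new, old, how='outer', on=['table', 'total'], indicator=True)
table_only = pd.merge(new, old, how='outer', on=['table'], indicator=True)

# No (table, total) pair exists in both frames, so nothing is matched
matched_both_keys = int((both_keys['_merge'] == 'both').sum())
# Joining on table alone matches the rows for a and b
matched_table_only = int((table_only['_merge'] == 'both').sum())

print(matched_both_keys, matched_table_only)  # → 0 2
```

With both keys, every row from each frame survives unmatched, which is why the descriptions end up in separate rows instead of being combined.
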
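# Outer join on table only, then keep the preferred value from each side
import numpy as np
import pandas as pd

new = pd.DataFrame({'table': ['a', 'b', 'c', 'd'], 'desc': ['', '', '', ''], 'total': [22, 22, 22, 22]})
old = pd.DataFrame({'table': ['a', 'b', 'e'], 'desc': ['foo', 'foo', 'foo'], 'total': [11, 11, 11]})

# columns present in both frames get the suffixes _x (new) and _y (old)
merged = pd.merge(new, old, how='outer', on=['table'])

# select preferred columns: treat '' as missing so old's desc can fill it in
merged["desc"] = merged["desc_x"].replace('', np.nan).fillna(merged["desc_y"]).fillna("")
# the outer join introduces NaN, which turns total into float; cast back to int
merged["total"] = merged["total_x"].fillna(merged["total_y"]).astype(int)

# clean up the suffixed columns
merged = merged.drop(columns=[c for c in merged.columns if c[-2:] in ["_x", "_y"]])

merged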
  table desc  total
0     a  foo     22
1     b  foo     22
2     c          22
3     d          22
4     e  foo     11
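
As an alternative that is not part of the answer above, the same preference ("take the value from new, fall back to old") can be sketched with `DataFrame.combine_first` after indexing both frames by table. This assumes table is unique within each frame; the names `new_idx`, `old_idx`, `combined` and `result` are hypothetical.

```python
import pandas as pd

new = pd.DataFrame({'table': ['a', 'b', 'c', 'd'], 'desc': ['', '', '', ''], 'total': [22, 22, 22, 22]})
old = pd.DataFrame({'table': ['a', 'b', 'e'], 'desc': ['foo', 'foo', 'foo'], 'total': [11, 11, 11]})

# Turn empty descriptions into NaN so they count as missing, then index by table
new_idx = new.assign(desc=new['desc'].mask(new['desc'].eq(''))).set_index('table')
old_idx = old.set_index('table')

# Keep new's value where it is present, otherwise fall back to old's value
combined = new_idx.combine_first(old_idx)

# Restore '' for descriptions missing on both sides and make total an int again
result = (
    combined.assign(desc=combined['desc'].fillna(''), total=combined['total'].astype(int))
    .reset_index()[['table', 'desc', 'total']]
)

print(result)
```

This avoids the suffixed `_x`/`_y` columns entirely, at the cost of requiring a unique key per frame.
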