Merge two pandas dataframes with multiple columns per row

I have a dataframe "df1" that looks like this:

company id  company name  dealid_1  dealyear_1  dealid_2  dealyear_2
C1          ABC
C2          DEF

I want to fill the empty cells with data from another dataframe "df2", which looks like this:

deal id  deal year  company id  company name
D1       2010       C1          ABC
D2       2015       C1          ABC
D3       2012       C2          DEF
D4       2017       C2          DEF

So the final "df1" should look like this:

company id  company name  dealid_1  dealyear_1  dealid_2  dealyear_2
C1          ABC           D1        2010        D2        2015
C2          DEF           D3        2012        D4        2017

Can anyone help me with this?

Thanks!

Use GroupBy.cumcount to build a per-company counter, pivot with DataFrame.pivot, sort the second level of the resulting column MultiIndex with DataFrame.sort_index, and finally flatten the MultiIndex:

df3 = (df2.assign(g = df2.groupby(['company id','company name']).cumcount())
         .pivot(index=['company id','company name'], columns='g')
         .sort_index(axis=1, level=1))
df3.columns = df3.columns.map(lambda x: f'{x[0]}_{x[1] + 1}')
print(df3.reset_index())
  company id company name deal id_1  deal year_1 deal id_2  deal year_2
0         C1          ABC        D1         2010        D2         2015
1         C2          DEF        D3         2012        D4         2017
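
To see what the counter step contributes on its own, here is a minimal sketch. It assumes df2 holds exactly the four sample rows from the question; the variable name `counter` is only illustrative. cumcount numbers the rows within each company starting from 0, and those numbers become the column suffixes after pivoting.

```python
import pandas as pd

# Sample data copied from the question (assumed to be the complete df2).
df2 = pd.DataFrame({
    'deal id': ['D1', 'D2', 'D3', 'D4'],
    'deal year': [2010, 2015, 2012, 2017],
    'company id': ['C1', 'C1', 'C2', 'C2'],
    'company name': ['ABC', 'ABC', 'DEF', 'DEF'],
})

# Number the deals within each company: 0 for the first deal, 1 for the second, ...
counter = df2.groupby(['company id', 'company name']).cumcount()

# Show the counter next to the original rows.
print(df2.assign(g=counter))
```

The counter follows the row order of df2, so if the deals should be numbered by year, sorting df2 by 'deal year' beforehand would probably be needed.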

To merge with the first DataFrame, use:

df = df1[['company id', 'company name']].join(df3, on=['company id', 'company name'])
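
Putting the steps together, the following is a self-contained sketch rather than a definitive implementation. It assumes df1 and df2 contain only the sample rows from the question and that pandas is version 1.1 or later (needed for a list-valued `index` in pivot). The names `keys` and `result` are illustrative, and stripping the spaces from the column names is an extra step added here so that the output matches the `dealid_1` style headers the question asks for.

```python
import pandas as pd

# Sample frames rebuilt from the question; df1 only needs its key columns.
df1 = pd.DataFrame({'company id': ['C1', 'C2'], 'company name': ['ABC', 'DEF']})
df2 = pd.DataFrame({
    'deal id': ['D1', 'D2', 'D3', 'D4'],
    'deal year': [2010, 2015, 2012, 2017],
    'company id': ['C1', 'C1', 'C2', 'C2'],
    'company name': ['ABC', 'ABC', 'DEF', 'DEF'],
})

keys = ['company id', 'company name']

# Counter per company, then pivot so each deal gets its own pair of columns.
df3 = (df2.assign(g=df2.groupby(keys).cumcount())
          .pivot(index=keys, columns='g')
          .sort_index(axis=1, level=1))

# Flatten ('deal id', 0) -> 'dealid_1', ('deal year', 0) -> 'dealyear_1', ...
df3.columns = [f"{name.replace(' ', '')}_{g + 1}" for name, g in df3.columns]

# Attach the wide deal columns to the companies listed in df1.
result = df1[keys].join(df3, on=keys)
print(result)
```

Because join defaults to a left join, a company that appears in df1 but has no deals in df2 would be kept with missing values in the deal columns.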

You can use:

df3 = (df2.drop(columns='company name')
          .assign(col=df2.groupby('company name').cumcount().add(1).astype(str))
          .pivot(index='company id', columns='col')
       )
df3.columns = df3.columns.map('_'.join)

out = df1[['company id', 'company name']].merge(df3, on='company id')

Output:

  company id company name deal id_1 deal id_2  deal year_1  deal year_2
0         C1          ABC        D1        D2         2010         2015
1         C2          DEF        D3        D4         2012         2017
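
This variant groups all the 'deal id' columns before the 'deal year' columns, whereas the question interleaves them per deal. One possible way to reorder them is sketched below, assuming the sample data from the question; the name `ordered` is illustrative. It relies on Python's sort being stable, so within each numeric suffix the columns keep their existing order ('deal id' before 'deal year').

```python
import pandas as pd

# Sample frames rebuilt from the question.
df1 = pd.DataFrame({'company id': ['C1', 'C2'], 'company name': ['ABC', 'DEF']})
df2 = pd.DataFrame({
    'deal id': ['D1', 'D2', 'D3', 'D4'],
    'deal year': [2010, 2015, 2012, 2017],
    'company id': ['C1', 'C1', 'C2', 'C2'],
    'company name': ['ABC', 'ABC', 'DEF', 'DEF'],
})

# Same pivot as in the answer above, with a 1-based string counter.
df3 = (df2.drop(columns='company name')
          .assign(col=df2.groupby('company name').cumcount().add(1).astype(str))
          .pivot(index='company id', columns='col'))
df3.columns = df3.columns.map('_'.join)

# Sort by the numeric suffix only; the stable sort keeps id before year.
ordered = sorted(df3.columns, key=lambda c: int(c.rsplit('_', 1)[1]))

out = df1[['company id', 'company name']].merge(df3[ordered], on='company id')
print(out)
```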