Merge two dataframes based on rows
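I know there are plenty of resources on merging two Pandas DataFrames, but I'm trying to merge one df with a second df on ID, and I need to create new columns from the rows of the second df. That sounds a bit confusing, so here is an example that should clarify what I'm trying to do.
I have: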
dfa = pd.DataFrame({"ID": ["1", "2", "3"],"Color":["Red", "White", "Blue"],"Length":["16", "14.97", "22.75"]})
dfb = pd.DataFrame({"ID": ["1", "1", "2","3"],"Col1":["Color", "Width", "Length","Color"],"Value":["Blue", "14.97", "22.75","Green"]})
What I want:
dfc = pd.DataFrame({"ID": ["1", "2", "3"],"Color":["Blue", "White", "Green"],"Length":["16", "14.97", "22.75"],"c:Color":["Blue","NaN","Green"],"c:Width":["14.97","NaN","NaN"],"c:Length":["NaN","22.75","NaN"]})
Any help would be greatly appreciated!
Use pivot before merge:
>>> dfa.merge(dfb.pivot(index='ID', columns='Col1', values='Value').add_prefix('c:'), on='ID')
  ID  Color Length c:Color c:Length c:Width
0  1    Red     16    Blue      NaN   14.97
1  2  White  14.97     NaN    22.75     NaN
2  3   Blue  22.75   Green      NaN     NaN
To get 'exactly' your output (with the new columns in the order they first appear in dfb):
>>> dfa.merge(dfb.pivot(index='ID', columns='Col1', values='Value')[dfb['Col1'].unique()].add_prefix('c:'), on='ID')
  ID  Color Length c:Color c:Width c:Length
0  1    Red     16    Blue   14.97      NaN
1  2  White  14.97     NaN     NaN    22.75
2  3   Blue  22.75   Green     NaN      NaN
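For reference, here is a self-contained sketch of the same pivot-then-merge approach, using the sample frames from the question. The intermediate name `wide` is only for illustration and does not appear in the original. Keyword arguments are passed to `pivot` because, as far as I know, recent pandas releases no longer accept them positionally.

```python
import pandas as pd

# Sample frames copied from the question.
dfa = pd.DataFrame({
    "ID": ["1", "2", "3"],
    "Color": ["Red", "White", "Blue"],
    "Length": ["16", "14.97", "22.75"],
})
dfb = pd.DataFrame({
    "ID": ["1", "1", "2", "3"],
    "Col1": ["Color", "Width", "Length", "Color"],
    "Value": ["Blue", "14.97", "22.75", "Green"],
})

# Long -> wide: one row per ID, one column per distinct value of Col1.
# `wide` is an illustrative name, not from the original answer.
wide = dfb.pivot(index="ID", columns="Col1", values="Value")

# pivot returns the new columns in sorted order; select them in the order
# they first appear in dfb, then add the "c:" prefix.
wide = wide[dfb["Col1"].unique()].add_prefix("c:")

# "ID" is a column in dfa and the index name in `wide`; on="ID" matches both.
result = dfa.merge(wide, on="ID")
print(result)
```

Note that `pivot` raises an error if dfb contains the same (ID, Col1) pair more than once, so this assumes each pair is unique, as it is in the sample data.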
You need to reshape dfb to wide format before joining:
dfa.merge(
    dfb.pivot(
        index='ID',
        columns='Col1',
        values='Value'
    ).add_prefix('c:'),
    on='ID'
)
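One caveat, shown as a rough sketch: `merge` defaults to an inner join, so an ID that exists in dfa but has no rows in dfb would be dropped from the result. The frames below are hypothetical (ID "4" and the trimmed columns are not from the question) and only illustrate the difference that `how='left'` makes. Whether you want this depends on your data.

```python
import pandas as pd

# Hypothetical data: ID "4" exists in dfa but has no rows in dfb.
dfa = pd.DataFrame({
    "ID": ["1", "2", "4"],
    "Color": ["Red", "White", "Black"],
})
dfb = pd.DataFrame({
    "ID": ["1", "1", "2"],
    "Col1": ["Color", "Width", "Length"],
    "Value": ["Blue", "14.97", "22.75"],
})

# Reshape dfb to wide format, as in the answer above.
wide = dfb.pivot(index="ID", columns="Col1", values="Value").add_prefix("c:")

# Default how="inner": only IDs present in both frames survive.
inner = dfa.merge(wide, on="ID")

# how="left": every row of dfa is kept; unmatched IDs get NaN in the c: columns.
left = dfa.merge(wide, on="ID", how="left")

print(inner["ID"].tolist())
print(left["ID"].tolist())
```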