How to do a many-to-many join of two dataframes within the same group using pandas?
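I want to merge two dataframes into one, but the key column, Item, contains duplicate values. A plain cross join will not work, because rows should only be cross-joined within the same group (the same Item). Can anyone share an idea for solving this? Thanks.
For example: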
dataframe1:
ID Item Price
1 apple 5
1 banana 3
1 lemon 2
2 apple 7
2 banana 4
2 lemon 4
dataframe2:
Item state
apple TX
apple CA
apple NJ
banana CA
lemon NY
lemon PA
Expected result:
ID Item Price State
1 apple 5 TX
1 apple 5 NJ
1 apple 5 CA
1 banana 3 CA
1 lemon 2 NY
1 lemon 2 PA
2 apple 7 TX
2 apple 7 NJ
2 apple 7 CA
2 banana 4 CA
2 lemon 4 NY
2 lemon 4 PA
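You can do it with a plain merge. When no `on` argument is given, pd.merge joins on the columns the two frames have in common (here Item), and the default inner join returns every matching pair of rows, which is the per-Item many-to-many result:
pd.merge(df1, df2).sort_values(by=['ID'])
Output: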
ID Item Price state
0 1 apple 5 TX
1 1 apple 5 CA
2 1 apple 5 NJ
6 1 banana 3 CA
8 1 lemon 2 NY
9 1 lemon 2 PA
3 2 apple 7 TX
4 2 apple 7 CA
5 2 apple 7 NJ
7 2 banana 4 CA
10 2 lemon 4 NY
11 2 lemon 4 PA
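For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the two frames from the question (the names df1, df2 and result are just illustrative), passes on="Item" explicitly rather than relying on the shared-column default, and resets the index after sorting. The explicit key and the reset_index step are my additions, not part of the one-liner above.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 2],
    "Item": ["apple", "banana", "lemon", "apple", "banana", "lemon"],
    "Price": [5, 3, 2, 7, 4, 4],
})
df2 = pd.DataFrame({
    "Item": ["apple", "apple", "apple", "banana", "lemon", "lemon"],
    "state": ["TX", "CA", "NJ", "CA", "NY", "PA"],
})

# Inner join on Item: each df1 row is paired with every df2 row that
# has the same Item, so duplicates on both sides multiply per key.
result = (
    df1.merge(df2, on="Item", how="inner")
       .sort_values(by=["ID", "Item"])
       .reset_index(drop=True)
)

print(result)
```

Naming the key with on="Item" keeps the join from silently changing if the two frames later gain another column with the same name.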