Python - How can I combine two data frames on 3 columns and keep the columns from both data frames?
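I have two data frames that I want to stack one on top of the other, joined on 3 columns, while keeping the columns from both data frames.
The two data frames are: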
data_top = [{'Date': '15/06/2021', 'Code_top': 'a', 'ID_top': 1, 'Portfolio_top':100, 'Currency': 'EUR', 'Country': 'France', 'Sector': 'Finance', 'Name':'Bradley', 'Classification': 'xyz', 'Data_Type':0, 'Value': 3000000.5,'Weight': 0.05, 'Floor': 'Flag'},
{'Date': '15/06/2021', 'Code_top': 'b', 'ID_top': 2, 'Portfolio_top':200, 'Currency': 'EUR', 'Country': 'Germany', 'Sector': 'Real Estate', 'Name':'ApartmentsInc.', 'Classification': 'xyz', 'Data_Type':0, 'Value': 2000000.5,'Weight': 0.02, 'Floor': 'Flag'}]
data_bottom = [{'Code_bottom': 'a', 'ID_bottom': 1, 'Portfolio_bottom':100, 'Price': 151.9, 'Delta': -1000},
{'Code_bottom': 'b', 'ID_bottom': 2, 'Portfolio_bottom':200, 'Price': 25.5, 'Delta': 1000}]
import pandas as pd

data_top = pd.DataFrame(data_top)
data_bottom = pd.DataFrame(data_bottom)
The final result should look like this:
data_combined = [{'Date': '15/06/2021', 'Code_top': 'a', 'ID_top': 1, 'Portfolio_top':100, 'Currency': 'EUR', 'Country': 'France', 'Sector': 'Finance', 'Name':'Bradley', 'Classification': 'xyz', 'Data_Type':0, 'Value': 3000000.5,'Weight': 0.05, 'Floor': 'Flag'},
{'Date': '15/06/2021', 'Code_top': 'b', 'ID_top': 2, 'Portfolio_top':200, 'Currency': 'EUR', 'Country': 'Germany', 'Sector': 'Real Estate', 'Name':'ApartmentsInc.', 'Classification': 'xyz', 'Data_Type':0, 'Value': 2000000.5,'Weight': 0.02, 'Floor': 'Flag'},
{'Date': '15/06/2021', 'Code_top': 'a', 'ID_top': 1, 'Portfolio_top':100, 'Currency': 'EUR', 'Country': 'France', 'Sector': 'Finance', 'Name':'Bradley', 'Classification': 'xyz', 'Data_Type':0, 'Value': 3000000.5,'Weight': 0.05, 'Floor': 'Flag', 'Price':151.9, 'Delta':-1000},
{'Date': '15/06/2021', 'Code_top': 'b', 'ID_top': 2, 'Portfolio_top':200, 'Currency': 'EUR', 'Country': 'Germany', 'Sector': 'Real Estate', 'Name':'ApartmentsInc.', 'Classification': 'xyz', 'Data_Type':0, 'Value': 2000000.5,'Weight': 0.02, 'Floor': 'Flag', 'Price': 25.5, 'Delta': 1000},
]
data_combined = pd.DataFrame(data_combined)
I have made a few attempts, but none of them worked. Can anyone help me with this? Thanks in advance!
I hope I've understood your question correctly:
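# Inner join: match each row of data_top to data_bottom on the three key pairs.
x = data_top.merge(
    data_bottom,
    left_on=["Code_top", "ID_top", "Portfolio_top"],
    right_on=["Code_bottom", "ID_bottom", "Portfolio_bottom"],
)
# Stack the original rows on top of the merged rows, keeping only
# data_top's columns plus "Price" and "Delta" from data_bottom.
out = pd.concat([data_top, x[data_top.columns.tolist() + ["Price", "Delta"]]])
print(out)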
Prints:
         Date  Code_top  ID_top  Portfolio_top  Currency  Country       Sector            Name  Classification  Data_Type      Value  Weight  Floor  Price    Delta
0  15/06/2021         a       1            100       EUR   France      Finance         Bradley             xyz          0  3000000.5    0.05   Flag    NaN      NaN
1  15/06/2021         b       2            200       EUR  Germany  Real Estate  ApartmentsInc.             xyz          0  2000000.5    0.02   Flag    NaN      NaN
0  15/06/2021         a       1            100       EUR   France      Finance         Bradley             xyz          0  3000000.5    0.05   Flag  151.9  -1000.0
1  15/06/2021         b       2            200       EUR  Germany  Real Estate  ApartmentsInc.             xyz          0  2000000.5    0.02   Flag   25.5   1000.0
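If the repeated index (0, 1, 0, 1) in the output gets in the way, a small variation might help. The following is only a sketch under a few assumptions: the frames are cut down to a handful of the question's columns to keep it short, the names `left_keys`, `right_keys` and `merged` are made up for illustration, and the `*_bottom` key columns are assumed to be unwanted in the result. It drops those key columns after the merge instead of selecting columns by name, and passes `ignore_index=True` to `pd.concat` so the rows are renumbered.

```python
import pandas as pd

# Reduced versions of the frames from the question (fewer columns, same keys).
data_top = pd.DataFrame(
    [
        {"Date": "15/06/2021", "Code_top": "a", "ID_top": 1, "Portfolio_top": 100, "Name": "Bradley"},
        {"Date": "15/06/2021", "Code_top": "b", "ID_top": 2, "Portfolio_top": 200, "Name": "ApartmentsInc."},
    ]
)
data_bottom = pd.DataFrame(
    [
        {"Code_bottom": "a", "ID_bottom": 1, "Portfolio_bottom": 100, "Price": 151.9, "Delta": -1000},
        {"Code_bottom": "b", "ID_bottom": 2, "Portfolio_bottom": 200, "Price": 25.5, "Delta": 1000},
    ]
)

# Hypothetical helper names for the three key pairs.
left_keys = ["Code_top", "ID_top", "Portfolio_top"]
right_keys = ["Code_bottom", "ID_bottom", "Portfolio_bottom"]

# Inner join on the key pairs, then drop the redundant *_bottom key columns
# so only data_top's columns plus Price and Delta remain.
merged = data_top.merge(
    data_bottom, left_on=left_keys, right_on=right_keys
).drop(columns=right_keys)

# Stack the original rows on top of the enriched rows; ignore_index=True
# renumbers the result 0..3 instead of repeating 0, 1, 0, 1.
out = pd.concat([data_top, merged], ignore_index=True)
print(out)
```

The top two rows still get NaN for `Price` and `Delta`, because those columns do not exist in `data_top`; that is also why `Delta` ends up as a float column.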