Recode using three DataFrames in Python
I have three separate DataFrames:
import pandas as pd
df1 = pd.DataFrame({ "Log": ["1114","1115","1116","1117","1118","1119","120"], "Gender": ["2","2","2","1","1","1","2"] })
df2 = pd.DataFrame({"NAME": ["Gender"],"SOURCE": ["MALE_FEMALE_LIST"]})
df3 = pd.DataFrame({"ID":["0", "1", "2"], "MALE_FEMALE_LIST":["Select", "Male","Female"]})
df3.set_index("ID", inplace = True)
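df1 holds the data I want to recode using the information in df3. In other words: if a column header in df1 matches a value in df2's NAME column, look up the corresponding SOURCE in df2 and apply the df3 column with that name to the df1 column.
Try: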
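# Each row of df2 names a df1 column and the df3 column holding its labels.
for _, row in df2.iterrows():
    # df3 is indexed by ID, so map() looks each code up in that index.
    df1[row["NAME"]] = df1[row["NAME"]].map(df3[row["SOURCE"]])
print(df1)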
Prints:
    Log  Gender
0  1114  Female
1  1115  Female
2  1116  Female
3  1117    Male
4  1118    Male
5  1119    Male
6   120  Female
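Note that `map` turns any code missing from df3's index into NaN, and the loop above raises a KeyError if df2 names a column that df1 or df3 does not have. The following is a minimal sketch of a more defensive variant, not part of the original answer. The function name `recode` and its parameters `data`, `rules` and `lookup` are hypothetical names chosen for illustration, and keeping unmapped codes unchanged is an assumption about the desired behavior.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Log": ["1114", "1115", "1116", "1117", "1118", "1119", "120"],
    "Gender": ["2", "2", "2", "1", "1", "1", "2"],
})
df2 = pd.DataFrame({"NAME": ["Gender"], "SOURCE": ["MALE_FEMALE_LIST"]})
df3 = pd.DataFrame({
    "ID": ["0", "1", "2"],
    "MALE_FEMALE_LIST": ["Select", "Male", "Female"],
}).set_index("ID")


def recode(data, rules, lookup):
    """Return a recoded copy of `data` (hypothetical helper, not a pandas API)."""
    out = data.copy()  # leave the caller's DataFrame untouched
    for name, source in zip(rules["NAME"], rules["SOURCE"]):
        # Skip rules that point at columns that do not exist.
        if name not in out.columns or source not in lookup.columns:
            continue
        mapped = out[name].map(lookup[source])
        # Assumption: codes with no entry in the lookup keep their original value.
        out[name] = mapped.fillna(out[name])
    return out


recoded = recode(df1, df2, df3)
print(recoded)
```

Because the function works on a copy, df1 still holds the original codes afterwards, which makes it easy to compare before and after.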