Look up values when the column names of two dataframes match
I would like to write a function that updates the values of df1 wherever the column names of df1 and df2 match each other.
For example:
df1:
Name | Graduated | Employed | Married
AAA  | 1         | 2        | 3
BBB  | 0         | 1        | 2
CCC  | 1         | 0        | 1
df2:
Answer_Code | Graduated | Employed | Married
0           | No        | No       | No
1           | Yes       | Intern   | Engaged
2           | N/A       | PT       | Yes
3           | N/A       | FT       | Divorced
Desired result:
df3:
Name | Graduated | Employed | Married
AAA  | Yes       | PT       | Divorced
BBB  | No        | Intern   | Yes
CCC  | Yes       | No       | Engaged
I would like to write something like this (pseudocode):
IF df1.columns = df2.columns THEN
    df1.column.update(df1.column.map(df2.set_index('Answer_Code').column))
You can use map.
Example:
df1.Graduated.map(df2.Graduated)
yields
0 Yes
1 No
2 Yes
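So do this for each column, as follows:
for col in df1.columns:
    if col in df2.columns:
        df1[col] = df1[col].map(df2[col])
If necessary, remember to set the index to the answer code first, i.e. df2 = df2.set_index("Answer_Code"). Without that step, map matches the codes against df2's default integer index, which only works here because it happens to equal Answer_Code.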
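To make the loop above concrete, here is a minimal, self-contained sketch that wraps it in a function. The function name `decode_matching_columns` and its `key` parameter are illustrative choices, not part of the original answer, and the sample frames are rebuilt from the question.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'Name': ['AAA', 'BBB', 'CCC'],
                    'Graduated': [1, 0, 1],
                    'Employed': [2, 1, 0],
                    'Married': [3, 2, 1]})
df2 = pd.DataFrame({'Answer_Code': [0, 1, 2, 3],
                    'Graduated': ['No', 'Yes', np.nan, np.nan],
                    'Employed': ['No', 'Intern', 'PT', 'FT'],
                    'Married': ['No', 'Engaged', 'Yes', 'Divorced']})


def decode_matching_columns(data, codes, key='Answer_Code'):
    """Hypothetical helper: return a copy of `data` in which every column
    that also exists in `codes` is translated through `codes`."""
    lookup = codes.set_index(key)   # answer code -> label, one column per question
    out = data.copy()               # leave the caller's frame untouched
    for col in out.columns.intersection(lookup.columns):
        # Series.map with a Series argument looks each value up in its index
        out[col] = out[col].map(lookup[col])
    return out


df3 = decode_matching_columns(df1, df2)
print(df3)
```

Codes that are missing from df2 come back as NaN rather than raising, so it may be worth checking the result with `df3.isna()` if that matters for your data.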
One way is to use pd.DataFrame.lookup (deprecated in pandas 1.2 and removed in pandas 2.0, so this needs an older pandas):
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'Name': ['AAA', 'BBB', 'CCC'],
                    'Graduated': [1, 0, 1],
                    'Employed': [2, 1, 0],
                    'Married': [3, 2, 1]})
df2 = pd.DataFrame({'Answer_Code': [0, 1, 2, 3],
                    'Graduated': ['No', 'Yes', np.nan, np.nan],
                    'Employed': ['No', 'Intern', 'PT', 'FT'],
                    'Married': ['No', 'Engaged', 'Yes', 'Divorced']})

# perform lookup on df2 using row & column labels from df1
# (DataFrame.lookup only exists in pandas < 2.0)
arr = df2.set_index('Answer_Code')\
         .lookup(df1.iloc[:, 1:].values.flatten(),
                 df1.columns[1:].tolist() * len(df1))\
         .reshape(len(df1), -1)

# copy df1 and allocate values from arr
df3 = df1.copy()
df3.iloc[:, 1:] = arr
print(df3)
Name Graduated Employed Married
0 AAA Yes PT Divorced
1 BBB No Intern Yes
2 CCC Yes No Engaged
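On pandas 2.x, where DataFrame.lookup is no longer available, the same row-and-column lookup can be approximated with NumPy integer indexing. The following is a sketch under that assumption, not the original answer's method; the variable names (`codes`, `row_pos`, `col_pos`) are illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'Name': ['AAA', 'BBB', 'CCC'],
                    'Graduated': [1, 0, 1],
                    'Employed': [2, 1, 0],
                    'Married': [3, 2, 1]})
df2 = pd.DataFrame({'Answer_Code': [0, 1, 2, 3],
                    'Graduated': ['No', 'Yes', np.nan, np.nan],
                    'Employed': ['No', 'Intern', 'PT', 'FT'],
                    'Married': ['No', 'Engaged', 'Yes', 'Divorced']})

codes = df2.set_index('Answer_Code')
cols = [c for c in df1.columns if c in codes.columns]  # columns shared by both frames

# Translate every answer code in df1 into a row position within `codes`.
flat_codes = df1[cols].to_numpy().ravel()
row_pos = codes.index.get_indexer(flat_codes).reshape(len(df1), -1)
if (row_pos < 0).any():
    # get_indexer returns -1 for labels it cannot find
    raise KeyError('df1 contains an answer code that is missing from df2')

# Pair each row position with the position of its own column, then index once.
col_pos = np.arange(len(cols))
arr = codes[cols].to_numpy()[row_pos, col_pos]

# Build the result without mutating df1.
df3 = df1.assign(**{c: arr[:, j] for j, c in enumerate(cols)})
print(df3)
```

Unlike the map-based loop, this raises on unknown codes instead of silently producing NaN, which is a design choice of this sketch rather than something the original answers specify.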