Subtract a column between two dataframes by matching ID and replace the values

I have df1, which contains:

IDs    values
E21    32
DD12   82
K99    222

df2 contains:

IDs   values
GU1   87
K99   93
E21   48

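What I need is to check whether each ID in df2 exists in df1. If it does, compute the df1 value minus the df2 value for that ID and update the value in df2.

If an ID in df2 does not exist in df1, its value in df2 stays unchanged.

So the result for the example above (basically, df2 is updated) would be: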

IDs    values
GU1    87 # unchanged, since it does not exist in df1
K99    129 # exists in df1, so subtract: 222-93=129
E21    -16 # exists in df1, so subtract: 32-48=-16

Any help?

Here is an approach using pd.merge:

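import pandas as pd

# left-merge df1 onto df2; df1's column arrives as 'values_2' (NaN where the ID is not in df1)
dfx = pd.merge(df2, df1, on='IDs', how='left', suffixes=('','_2'))

# df1 value minus df2 value; NaN for IDs that are missing from df1
dfx['val'] = dfx['values_2'] - dfx['values']
# fall back to the original df2 value wherever the difference is NaN
dfx['val'] = dfx['val'].combine_first(dfx['values'])
# keep only the ID and the result, named 'values' again
dfx = dfx[['IDs','val']].rename(columns={'val':'values'})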

print(dfx)

   IDs  values
0  GU1    87.0
1  K99   129.0
2  E21   -16.0
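
The float output above comes from the NaN that the left merge introduces for unmatched IDs. The following is a minimal, self-contained sketch of the same merge idea that casts back to integers at the end. The sample frames are rebuilt from the question, the names diff and result are only illustrative, and it assumes IDs are unique in df1 (duplicates would multiply rows in the merge).

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({"IDs": ["E21", "DD12", "K99"], "values": [32, 82, 222]})
df2 = pd.DataFrame({"IDs": ["GU1", "K99", "E21"], "values": [87, 93, 48]})

# Left merge keeps every row of df2 in df2's order;
# IDs missing from df1 get NaN in 'values_2'
dfx = pd.merge(df2, df1, on="IDs", how="left", suffixes=("", "_2"))

# df1 value minus df2 value (NaN where there was no match)
diff = dfx["values_2"] - dfx["values"]

# Fill unmatched rows with the original df2 value, then cast back to int
result = dfx[["IDs"]].assign(values=diff.fillna(dfx["values"]).astype("int64"))
print(result)
```

The cast is safe here only because every NaN has been filled before astype is called.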

IIUC:

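# lookup Series: IDs -> df1 values
d = df1.set_index('IDs')['values']
# iterate over df2 rows as plain (IDs, values) tuples
i = df2.itertuples(index=False)
# assign returns a new frame; reassign it to df2 to keep the result
df2.assign(values=[d[x] - v if x in d.index else v for x, v in i])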

   IDs  values
0  GU1      87
1  K99     129
2  E21     -16

Exactly the same idea, but using a dict instead of a pandas.Series:

d = dict(zip(df1['IDs'], df1['values']))
i = df2.itertuples(index=False)
df2.assign(values=[d[x] - v if x in d else v for x, v in i])
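
The lookup idea can also be written without a Python-level loop. This is a sketch of a vectorized variant using Series.map, not something the answer above proposes. The sample frames are rebuilt from the question, and lookup, matched, new_values and out are illustrative names.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({"IDs": ["E21", "DD12", "K99"], "values": [32, 82, 222]})
df2 = pd.DataFrame({"IDs": ["GU1", "K99", "E21"], "values": [87, 93, 48]})

# Same dict lookup as above: ID -> df1 value
lookup = dict(zip(df1["IDs"], df1["values"]))

# map() yields NaN for IDs that are not keys of the dict
matched = df2["IDs"].map(lookup)

# Keep the difference where a match exists, otherwise the original df2 value
new_values = (matched - df2["values"]).where(matched.notna(), df2["values"])

# Every NaN has been replaced, so casting back to int is safe
out = df2.assign(values=new_values.astype("int64"))
print(out)
```

As with assign in the answers above, df2 itself is left untouched; reassign out to df2 if the update should stick.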

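You can use the update method, provided IDs is set as the index of both frames:

df1, df2 = df1.set_index('IDs'), df2.set_index('IDs')
df2.update(df1 - df2)

Output: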

     values
IDs        
GU1    87.0
K99   129.0
E21   -16.0
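
To see why update leaves GU1 alone, it can help to look at the intermediate difference. The sketch below rebuilds the sample frames from the question and uses the illustrative names a, b and delta. It relies on two documented behaviors: arithmetic between DataFrames aligns on index labels, and update skips NaN values in its argument.

```python
import pandas as pd

# Sample data copied from the question, indexed by IDs
a = pd.DataFrame({"IDs": ["E21", "DD12", "K99"], "values": [32, 82, 222]}).set_index("IDs")
b = pd.DataFrame({"IDs": ["GU1", "K99", "E21"], "values": [87, 93, 48]}).set_index("IDs")

# Subtraction aligns on the index; an ID present in only one frame gives NaN
delta = a - b
print(delta)  # DD12 and GU1 are NaN, E21 and K99 hold the differences

# update() modifies b in place, ignores NaN entries of delta,
# and ignores labels (such as DD12) that are not in b's index
b.update(delta)
print(b)
```

Whether the updated column prints as float or int can depend on the pandas version, so an explicit astype may be needed if integers are required.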

Another option is an explicit loop over df2 (this assumes df2 has the default 0..n-1 index):

# create a new column in df2 named 'new', starting as a copy of 'values'
df2['new'] = df2['values']
# loop over the entries of the 'IDs' column
for i, element in enumerate(df2.IDs):
    # check whether this ID exists in df1
    if element in df1.IDs.values:
        # .loc replaces the chained df2['new'][i] assignment, which may write to a temporary copy
        df2.loc[i, 'new'] = df1.loc[df1.IDs == element, 'values'].iloc[0] - df2.loc[i, 'values']
# drop the old values column
df2.drop('values', axis=1, inplace=True)
# rename the new values column
df2.rename(columns={'new': 'values'}, inplace=True)