How do I divide values across dataframes in pandas?

I have an original data set: original_data_set

I read it in from a CSV file and then split it by field, like so:

loan_df = re_df.loc[re_df.field == 'loan_amount']
home_df = re_df.loc[re_df.field == 'home_value']

This yields: loans home_vals

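I want to divide the value column of one dataframe by the value column of the other, but when I try ltv_df = loan_df['value']/home_df['value'], I get a Series of NaN values.

Does anyone have any suggestions?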

Two options:

If you only need the values, NumPy division works:

ltv_df = loan_df['value'].values / home_df['value'].values
[0.57238284 1.30293486]

Or, if you need a DataFrame, use set_index, divide, then reset_index to get back a DataFrame:

ltv_df = (
        loan_df.set_index('loan_id')['value'] /
        home_df.set_index('loan_id')['value']
).reset_index(name='result')
   loan_id    result
0        1  0.572383
1        2  1.302935

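Alternatively, get the values directly from the initial DataFrame via groupby, apply and np.divide. This assumes each loan_id has exactly two rows, with the loan_amount row before the home_value row: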
ltv_df = (
    re_df.groupby('loan_id')['value'].apply(lambda x: np.divide(*x))
        .reset_index(name='result')
)
   loan_id    result
0        1  0.572383
1        2  1.302935
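
The groupby version depends on row order, because np.divide(*x) unpacks each group's values positionally. If that order is not guaranteed, one possible variation (not part of the original answer) is to reshape with pivot so each field becomes a named column, then divide by name. This is a minimal sketch that assumes every (loan_id, field) pair occurs exactly once; pivot raises a ValueError on duplicates.

```python
import pandas as pd

re_df = pd.DataFrame({'loan_id': [1, 1, 2, 2],
                      'field': ['loan_amount', 'home_value'] * 2,
                      'value': [65037, 113625, 84395, 64773]})

# One row per loan_id, one column per distinct value of 'field'.
wide = re_df.pivot(index='loan_id', columns='field', values='value')

# Columns are selected by name, so the row order in re_df no longer matters.
ltv_df = (wide['loan_amount'] / wide['home_value']).reset_index(name='result')
print(ltv_df)
```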

DataFrame setup:

import numpy as np
import pandas as pd

re_df = pd.DataFrame({'loan_id': [1, 1, 2, 2],
                      'field': ['loan_amount', 'home_value'] * 2,
                      'value': [65037, 113625, 84395, 64773]})

loan_df = re_df.loc[re_df.field == 'loan_amount']
home_df = re_df.loc[re_df.field == 'home_value']
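
As for why the original division returns NaN: pandas aligns two Series on their index labels before dividing, and the two filtered frames keep the non-overlapping row labels of re_df. The sketch below reproduces this with the setup data. The names aligned and by_position are only illustrative.

```python
import numpy as np
import pandas as pd

re_df = pd.DataFrame({'loan_id': [1, 1, 2, 2],
                      'field': ['loan_amount', 'home_value'] * 2,
                      'value': [65037, 113625, 84395, 64773]})

loan_df = re_df.loc[re_df.field == 'loan_amount']
home_df = re_df.loc[re_df.field == 'home_value']

# The filtered frames keep the original row labels, which do not overlap.
print(loan_df.index.tolist())  # [0, 2]
print(home_df.index.tolist())  # [1, 3]

# Division aligns on labels, so each label is missing on one side and gives NaN.
aligned = loan_df['value'] / home_df['value']
print(aligned.isna().all())  # True

# Resetting both indexes to 0..n-1 pairs the rows by position instead.
by_position = (loan_df['value'].reset_index(drop=True) /
               home_df['value'].reset_index(drop=True))
print(by_position.round(6).tolist())  # [0.572383, 1.302935]
```

Using .values sidesteps alignment in the same way, which is why it works. Both approaches pair rows purely by position, so they assume the two frames have the same length and list the loans in the same order.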