Payment schedule simulation in Pandas
I am trying to build a payment schedule from a start date and an end date, but I'm stuck.
Here is my sample code:
import numpy as np
import pandas as pd
from dateutil.relativedelta import relativedelta
from tqdm import tqdm

tqdm.pandas()  # enables DataFrame.progress_apply

df = pd.DataFrame({
    'REF_NO': ['VN1211001'],
    'From': ['2022-3-31'],
    'To': ['2024-4-12'],
    'Frequency': [6],  # months between payments
    'Amount': [600000]})
df['From'] = pd.to_datetime(df['From'])
df['To'] = pd.to_datetime(df['To'])
df['Date Diff'] = (df['To'] - df['From']).dt.days
# Average month length in days (365.2425 / 12); recent pandas versions
# reject dividing by np.timedelta64(1, 'M')
df['Month Diff'] = df['Date Diff'] / 30.436875
df['Repayment Times'] = np.ceil(df['Month Diff'] / df['Frequency'])
df['Repayment Amount'] = df['Amount'] / df['Repayment Times']

def add_months(start_date, delta_period):
    # Shift start_date forward by delta_period calendar months
    end_date = start_date + relativedelta(months=int(delta_period))
    return end_date

df['Next Payment'] = df.progress_apply(lambda row: add_months(row['From'], row['Frequency']), axis=1)
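This is what I get: only the first next payment date for each row.
What I want is for all of the upcoming payment dates to be shown, for example: 2023-03-31, 2023-09-30, ...
Is there a way to do this, or a similar sample I could learn from?
Thanks.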
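Build a new DataFrame by processing each row of the original one. For each row, generate the list of payment dates from the start date, the end date, and the payment interval. The start date itself is not a payment, so the loop adds rows starting from the second date in the list.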
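rows = []
for idx, row in df.iterrows():
    # Month-end dates every `Frequency` months between From and To
    pay_list = pd.date_range(row['From'], row['To'], freq=pd.offsets.MonthEnd(int(row['Frequency'])))
    # If To falls after the last scheduled date, add it as the final payment
    if row['To'] > pay_list[-1]:
        pay_list = pay_list.append(pd.DatetimeIndex([row['To']]))
    # Skip the first date (the start date) and emit one row per payment
    for p in pay_list[1:]:
        rows.append({**row.to_dict(), 'Next Payment': p})
# DataFrame.append was removed in pandas 2.0, so collect the rows and build the frame once
new_df = pd.DataFrame(rows)
new_df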
      REF_NO        From          To  Frequency    Amount  Date Diff  Month Diff  Repayment Times  Repayment Amount  Next Payment
0  VN1211001  2022-03-31  2024-04-12        6.0  600000.0      743.0   24.411179              5.0          120000.0    2022-09-30
1  VN1211001  2022-03-31  2024-04-12        6.0  600000.0      743.0   24.411179              5.0          120000.0    2023-03-31
2  VN1211001  2022-03-31  2024-04-12        6.0  600000.0      743.0   24.411179              5.0          120000.0    2023-09-30
3  VN1211001  2022-03-31  2024-04-12        6.0  600000.0      743.0   24.411179              5.0          120000.0    2024-03-31
4  VN1211001  2022-03-31  2024-04-12        6.0  600000.0      743.0   24.411179              5.0          120000.0    2024-04-12
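The same schedule can also be produced without a row-by-row loop over the frame, by storing each row's payment dates as a list and expanding it with `DataFrame.explode`. The following is only a sketch under a few assumptions: the helper name `payment_dates` and the variable `schedule` are made up for illustration, payments are assumed to fall on month ends as in the answer above, and the start date is dropped with a `dates > start` filter rather than by slicing off the first element (the two agree for the sample row, where the start date is itself a month end).

```python
import pandas as pd

def payment_dates(start, end, months):
    """Hypothetical helper: month-end dates every `months` months after
    `start`, with `end` added as a final payment if it is not already covered."""
    dates = pd.date_range(start, end, freq=pd.offsets.MonthEnd(int(months)))
    # The start date is not a payment, so keep only strictly later dates.
    dates = dates[dates > start]
    # Close the schedule on the end date when the grid stops short of it.
    if len(dates) == 0 or dates[-1] < end:
        dates = dates.append(pd.DatetimeIndex([end]))
    return list(dates)

df = pd.DataFrame({
    'REF_NO': ['VN1211001'],
    'From': pd.to_datetime(['2022-03-31']),
    'To': pd.to_datetime(['2024-04-12']),
    'Frequency': [6],
    'Amount': [600000],
})

# One list of payment dates per contract row.
df['Next Payment'] = [
    payment_dates(start, end, months)
    for start, end, months in zip(df['From'], df['To'], df['Frequency'])
]

# Expand each list into one row per payment date.
schedule = df.explode('Next Payment', ignore_index=True)
# explode leaves an object column, so convert it back to datetimes.
schedule['Next Payment'] = pd.to_datetime(schedule['Next Payment'])
print(schedule[['REF_NO', 'Next Payment']])
```

Building all rows in one `explode` call avoids growing a DataFrame inside a loop, which tends to matter once there are many contracts. If payments should instead fall on the anniversary day of `From` rather than on month ends, the date generation inside the helper would need to change.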