Pandas: Generate Two Dates Per Year

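Newbie here. I'm trying to take a policy and create two records per year, split around its anniversary date. My input will be a single row per policy and will include the issue date and the end date. I want to expand that row into two records per policy per year, split around the anniversary date. My output should look like this: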

policy_key issue_date end_date period_start period_end
12345 2005-03-15 2020-10-18 2005-03-15 2005-12-31
12345 2005-03-15 2020-10-18 2006-01-01 2006-03-14
12345 2005-03-15 2020-10-18 2006-03-15 2006-12-31
... ... ... ... ...
12345 2005-03-15 2020-10-18 2020-01-01 2020-03-14
12345 2005-03-15 2020-10-18 2020-03-15 2020-10-18

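This is what I've tried so far, but I can't figure out how to make period_start begin at either the start of the year or the anniversary date, and period_end finish on either the day before the anniversary or the last day of the year.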

# importing libraries
import pandas as pd
import datetime as dt
import numpy as np
import math as math
from datetime import date, timedelta

# arbitrary policy inputs
policy_key = '12345'
issue_date = '2005-03-15'
issue_age = 35
duration = 10
end_date = '2020-10-18'
freq = pd.DateOffset(months=6)

# creating dataframe
df = pd.DataFrame({'policy_key':policy_key,
                    'issue_date':issue_date,
                    'issue_age':issue_age,
                    'duration':duration,
                    'attained_age':issue_age + duration - 1,
                    'period_start':pd.date_range(start=issue_date,end=end_date, freq=freq)})

df['issue_date'] = pd.to_datetime(df['issue_date'], format='%Y-%m-%d')

It's a bit involved, but it works!

# importing libraries
import pandas as pd

# arbitrary policy inputs
policy_key = '12345'
issue_date = '2005-03-15'
issue_age = 35
duration = 10
end_date = '2020-10-18'

Convert the dates to datetime:

issue_date = pd.to_datetime(issue_date, format="%Y-%m-%d")
end_date = pd.to_datetime(end_date, format="%Y-%m-%d")

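Create three lists for each set of dates.

Start dates:

  1. The given issue_date
  2. The start of each year, offset by issue_date.month-1 months and issue_date.day-1 days (these are the anniversaries)
  3. The start of each year

The end dates are built the same way: the given end_date, the end of each year, and the start of each year offset by issue_date.month-1 months and issue_date.day-2 days (the day before each anniversary).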
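
To see what these offsets do, here is a minimal sketch for a single year, using the issue date from the question. The variable names `jan_first`, `anniversary` and `day_before` are just for illustration.

```python
import pandas as pd

issue_date = pd.Timestamp("2005-03-15")
jan_first = pd.Timestamp("2006-01-01")

# Anniversary: Jan 1 shifted by (month - 1) months and (day - 1) days
anniversary = jan_first + pd.DateOffset(months=issue_date.month - 1,
                                        days=issue_date.day - 1)

# Day before the anniversary: the same shift with one day fewer
day_before = jan_first + pd.DateOffset(months=issue_date.month - 1,
                                       days=issue_date.day - 2)

print(anniversary.date())  # 2006-03-15
print(day_before.date())   # 2006-03-14
```

The months are applied before the days, so Jan 1 first moves to Mar 1 and then forward 14 (or 13) days.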

# start dates
start_dates = [issue_date] + \
    list(pd.date_range(issue_date,
                       end_date,
                       freq=pd.offsets.YearBegin()) +\
         pd.offsets.DateOffset(months=issue_date.month-1,
                               days=issue_date.day-1)) + \
    list(pd.date_range(issue_date,
                       end_date,
                       freq=pd.offsets.YearBegin()))

# end dates
end_dates = [end_date] + \
    list(pd.date_range(issue_date,
                       end_date,
                       freq=pd.offsets.YearEnd())) + \
    list(pd.date_range(issue_date,
                       end_date,
                       freq=pd.offsets.YearBegin()) +\
         pd.offsets.DateOffset(months=issue_date.month-1,
                               days=issue_date.day-2))

Sort both lists so the dates are in chronological order and each start lines up with its end:

# sort values
start_dates.sort()
end_dates.sort()
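
Pairing two independently sorted lists only works if every start has a matching end, so it may be worth checking the result before building the DataFrame. The helper below is a rough sketch (`check_periods` is a made-up name, not a pandas function) that verifies no period ends before it starts and that each period begins the day after the previous one ends.

```python
import pandas as pd

def check_periods(starts, ends):
    """Hypothetical helper: True if the periods are valid and contiguous."""
    starts = pd.Series(sorted(starts))
    ends = pd.Series(sorted(ends))
    if len(starts) != len(ends):
        return False
    # every period must end on or after the day it starts
    if not (ends >= starts).all():
        return False
    # each period must start exactly one day after the previous one ends
    next_starts = starts.iloc[1:].reset_index(drop=True)
    prev_ends = ends.iloc[:-1].reset_index(drop=True)
    return bool(((next_starts - prev_ends) == pd.Timedelta(days=1)).all())

sample_starts = pd.to_datetime(["2005-03-15", "2006-01-01", "2006-03-15"]).tolist()
sample_ends = pd.to_datetime(["2005-12-31", "2006-03-14", "2006-12-31"]).tolist()
print(check_periods(sample_starts, sample_ends))  # True
```

Calling `check_periods(start_dates, end_dates)` on the lists built above should return True for the example policy.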

Build the DataFrame from the lists:

df = pd.DataFrame({'policy_key': policy_key,
                   'issue_date': issue_date,
                   'end_date': end_date,
                   'issue_age': issue_age,
                   'duration': duration,
                   'attained_age': issue_age + duration - 1,
                   'period_start': start_dates,
                   'period_end': end_dates})
#    policy_key issue_date   end_date  issue_age  duration  attained_age period_start period_end
# 0       12345 2005-03-15 2020-10-18         35        10            44   2005-03-15 2005-12-31
# 1       12345 2005-03-15 2020-10-18         35        10            44   2006-01-01 2006-03-14
# 2       12345 2005-03-15 2020-10-18         35        10            44   2006-03-15 2006-12-31
# 3       12345 2005-03-15 2020-10-18         35        10            44   2007-01-01 2007-03-14
# 4       12345 2005-03-15 2020-10-18         35        10            44   2007-03-15 2007-12-31
# 5       12345 2005-03-15 2020-10-18         35        10            44   2008-01-01 2008-03-14
# 6       12345 2005-03-15 2020-10-18         35        10            44   2008-03-15 2008-12-31
# 7       12345 2005-03-15 2020-10-18         35        10            44   2009-01-01 2009-03-14
# 8       12345 2005-03-15 2020-10-18         35        10            44   2009-03-15 2009-12-31
# 9       12345 2005-03-15 2020-10-18         35        10            44   2010-01-01 2010-03-14
# 10      12345 2005-03-15 2020-10-18         35        10            44   2010-03-15 2010-12-31
# 11      12345 2005-03-15 2020-10-18         35        10            44   2011-01-01 2011-03-14
# 12      12345 2005-03-15 2020-10-18         35        10            44   2011-03-15 2011-12-31
# 13      12345 2005-03-15 2020-10-18         35        10            44   2012-01-01 2012-03-14
# 14      12345 2005-03-15 2020-10-18         35        10            44   2012-03-15 2012-12-31
# 15      12345 2005-03-15 2020-10-18         35        10            44   2013-01-01 2013-03-14
# 16      12345 2005-03-15 2020-10-18         35        10            44   2013-03-15 2013-12-31
# 17      12345 2005-03-15 2020-10-18         35        10            44   2014-01-01 2014-03-14
# 18      12345 2005-03-15 2020-10-18         35        10            44   2014-03-15 2014-12-31
# 19      12345 2005-03-15 2020-10-18         35        10            44   2015-01-01 2015-03-14
# 20      12345 2005-03-15 2020-10-18         35        10            44   2015-03-15 2015-12-31
# 21      12345 2005-03-15 2020-10-18         35        10            44   2016-01-01 2016-03-14
# 22      12345 2005-03-15 2020-10-18         35        10            44   2016-03-15 2016-12-31
# 23      12345 2005-03-15 2020-10-18         35        10            44   2017-01-01 2017-03-14
# 24      12345 2005-03-15 2020-10-18         35        10            44   2017-03-15 2017-12-31
# 25      12345 2005-03-15 2020-10-18         35        10            44   2018-01-01 2018-03-14
# 26      12345 2005-03-15 2020-10-18         35        10            44   2018-03-15 2018-12-31
# 27      12345 2005-03-15 2020-10-18         35        10            44   2019-01-01 2019-03-14
# 28      12345 2005-03-15 2020-10-18         35        10            44   2019-03-15 2019-12-31
# 29      12345 2005-03-15 2020-10-18         35        10            44   2020-01-01 2020-03-14
# 30      12345 2005-03-15 2020-10-18         35        10            44   2020-03-15 2020-10-18
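
One caveat: as far as I can tell, the list-based approach above relies on the end date falling on or after the anniversary in the final year. With an end date such as 2020-02-10 it appears to produce a last row that starts on 2020-03-15 and ends on 2020-03-14. If that matters for your data, here is an alternative sketch that collects every boundary, keeps only those inside the policy term, and derives each end from the next start. The function name `split_policy_periods` and the `policies` frame are made up for illustration, and it assumes end_date is not before issue_date.

```python
import pandas as pd

def split_policy_periods(issue_date, end_date):
    """Split [issue_date, end_date] at every Jan 1 and every anniversary."""
    issue_date = pd.Timestamp(issue_date)
    end_date = pd.Timestamp(end_date)
    if end_date < issue_date:
        raise ValueError("end_date must not be before issue_date")

    n_years = end_date.year - issue_date.year
    jan_firsts = [pd.Timestamp(year=issue_date.year + n, month=1, day=1)
                  for n in range(1, n_years + 1)]
    # Adding whole years moves a Feb 29 issue date to Feb 28 in non-leap years
    anniversaries = [issue_date + pd.DateOffset(years=n)
                     for n in range(1, n_years + 1)]

    # Keep boundaries inside the term; the set also removes duplicates
    boundaries = {d for d in jan_firsts + anniversaries
                  if issue_date < d <= end_date}
    starts = sorted(boundaries | {issue_date})
    # Each period ends the day before the next starts; the last ends on end_date
    ends = [d - pd.Timedelta(days=1) for d in starts[1:]] + [end_date]
    return pd.DataFrame({"period_start": starts, "period_end": ends})

# One input row per policy, expanded into one output row per period
policies = pd.DataFrame({"policy_key": ["12345"],
                         "issue_date": ["2005-03-15"],
                         "end_date": ["2020-10-18"]})
expanded = pd.concat(
    [split_policy_periods(row.issue_date, row.end_date)
         .assign(policy_key=row.policy_key)
     for row in policies.itertuples()],
    ignore_index=True)
print(len(expanded))  # 31
```

For the example policy this gives the same 31 periods as the table above. Looping over rows is fine for a modest number of policies; for very large inputs a vectorized approach may be worth the extra effort.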