Assessing which person is the one with the next birthday
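In Stata, I am trying to work out which of several given birthdays comes next after a given date. My data look like this:
- All dates are in daily format (%dD_m_Y), e.g. 18 Mar 1926.
- The variable date is the reference date against which all other dates should be compared.
- The variables birth1, birth2, birth3, birth4, birth5, birth6 contain the birthdays of all possible household members.
Example: a household has two adults, A and B. A's birthday is 20 November 1977 and B's is 30 March 1978. The reference date is 29.11.2020. I want to know who has the next birthday. In this example it is B, because A's birthday fell about a week before the reference date, so the next birthday in this household will be celebrated on 30 March 2021.
Example data: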
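| date | birth1 | birth2 | birth3 | birth4 | birth5 | birth6 |
|---|---|---|---|---|---|---|
| 02feb2021 | 15jan1974 | 27nov1985 | | | | |
| 30nov2020 | 31aug1945 | 27jun1999 | 07apr1997 | | | |
| 19nov2020 | 27sep1993 | 30dec1996 | | | | |
| 29jan2021 | 29mar1973 | | | | | |
| 05dec2020 | 21jan1976 | 02oct1976 | 21jan1976 | 25may1995 | 15feb1997 | |
| 25nov2020 | 25nov1943 | 29nov1946 | | | | |
| 02feb2021 | 28apr1979 | | | | | |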
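Edited to account for February 29
*When the year of date is not a leap year, the edit treats a person born on February 29 as having a March 1 birthday. If that does not make sense for your use case, the code below is easy to change.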
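As a small illustration of the leap-day rule, here is a sketch of one way to express it. It relies on mdy() returning missing for a date that does not exist. The variables refyear, bday, bday_mar1 and bday_feb28 are made up for this example, and the February 28 convention is only an assumed alternative for comparison.

```stata
clear
set obs 2
gen birth = td(29feb1996)
gen refyear = 2021 in 1          // non-leap year
replace refyear = 2024 in 2      // leap year

* mdy() returns missing when the date does not exist (Feb 29 in a non-leap year)
gen bday = mdy(month(birth), day(birth), refyear)

* Convention described above: fall back to March 1
gen bday_mar1 = cond(missing(bday), mdy(3, 1, refyear), bday)

* Assumed alternative convention: fall back to February 28
gen bday_feb28 = cond(missing(bday), mdy(2, 28, refyear), bday)

format birth bday* %td
list
```

In the leap year both conventions give 29feb2024; they only differ in the non-leap year.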
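Since you want the next upcoming birthday rather than the closest one, you can combine the year of date with the month and day of birth{i} to build a date for each person's next birthday, moving it to the following year when it has already passed. Then you can simply take the earliest value within each household. I reshape long and generate a person id and a household id to do this.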
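To make the idea concrete, here is a minimal sketch that applies it by hand to the two-person example from the question, using scalars instead of variables. The scalar names (ref, bA, bB, nextA, nextB) are made up for illustration, and the leap-day case is left out because neither birthday is February 29.

```stata
clear
* Reference date and birthdays from the question's example
scalar ref = td(29nov2020)
scalar bA  = td(20nov1977)
scalar bB  = td(30mar1978)

* Candidate: each person's birthday in the year of the reference date
scalar nextA = mdy(month(bA), day(bA), year(ref))
scalar nextB = mdy(month(bB), day(bB), year(ref))

* If that date has already passed, use the following year instead
if nextA < ref {
    scalar nextA = mdy(month(bA), day(bA), year(ref) + 1)
}
if nextB < ref {
    scalar nextB = mdy(month(bB), day(bB), year(ref) + 1)
}

display %td nextA   // 20nov2021
display %td nextB   // 30mar2021

* The earliest candidate is the household's next birthday (ties not handled here)
display "Next birthday: " cond(nextB < nextA, "B", "A")   // Next birthday: B
```

Both birthdays have already passed in 2020, so both candidates move to 2021 and B's comes first, which matches the result expected in the question.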
Make example data
clear
set obs 6
set seed 1996
* Reference dates drawn uniformly between 01dec2015 and 31dec2020
generate date = floor((mdy(12,31,2020)-mdy(12,1,2015)+1)*runiform() + mdy(12,1,2015))
format date %td
* Household n gets n members: birthN is filled only when _n >= N
forvalues i = 1/6 {
    gen birth`i' = floor((mdy(12,31,1996)-mdy(12,1,1980)+1)*runiform() + mdy(12,1,1980)) if _n >= `i'
    format birth`i' %td
}
replace birth6 = birth4 in 6 // want a tie
replace birth2 = date("29feb1996","DMY") in 3 // Feb 29
Find next birthday
gen household_id = _n
reshape long birth, i(date household_id) j(person)
drop if mi(birth)
* Candidate: this person's birthday in the year of the reference date
gen person_next_birthday = mdy(month(birth), day(birth), year(date))
* Treat Feb 29 as a March 1 birthday in non-leap years
* (mdy() returns missing for a date that does not exist)
replace person_next_birthday = mdy(3,1,year(date)) if missing(person_next_birthday)
* Birthday already passed this year: move it to the following year
replace person_next_birthday = mdy(month(birth), day(birth), year(date) + 1) if person_next_birthday < date
replace person_next_birthday = mdy(3,1,year(date)+1) if missing(person_next_birthday)
format person_next_birthday %td
* Earliest upcoming birthday within each household
bysort household_id (person_next_birthday): gen next_bday = person_next_birthday[1]
format next_bday %td
drop person_next_birthday
reshape wide birth, i(date household_id next_bday) j(person)
gen next_bday_persons = ""
* Build a string listing the household members who have the next birthday
foreach v of varlist birth* {
    local person = subinstr("`v'","birth","",.)
    local condition = "month(`v') == month(next_bday) & day(`v') == day(next_bday)"
    local condition_feb29 = "month(next_bday) == 3 & day(next_bday) == 1 & month(`v') == 2 & day(`v') == 29"
    replace next_bday_persons = next_bday_persons + "|`person'" if `condition' | `condition_feb29'
}
* Drop the leading separator
replace next_bday_persons = regexr(next_bday_persons,"^\|","")
order next_bday_persons, after(next_bday)
The last loop is not strictly necessary, but it shows that the approach is robust to ties.
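If you would rather not reshape, the same logic can be sketched in wide layout with one helper variable per person and egen's rowmin(). This is an alternative arrangement, not the code above, and the small input dataset and the next1-next3 helper variables are invented for the example. The first row is the question's example with a third member added to force a tie; the second row has a February 29 birthday.

```stata
clear
input str9 sdate str9 s1 str9 s2 str9 s3
"29nov2020" "20nov1977" "30mar1978" "30mar2010"
"05mar2020" "29feb1996" "04mar1990" ""
end
gen date = date(sdate, "DMY")
forvalues i = 1/3 {
    gen birth`i' = date(s`i', "DMY")
}
drop sdate s1 s2 s3

forvalues i = 1/3 {
    * Birthday in the reference year; Feb 29 falls back to March 1
    gen next`i' = mdy(month(birth`i'), day(birth`i'), year(date))
    replace next`i' = mdy(3, 1, year(date)) if missing(next`i') & !missing(birth`i')
    * Already passed: move to the following year, with the same fallback
    gen byte passed = next`i' < date
    replace next`i' = mdy(month(birth`i'), day(birth`i'), year(date) + 1) if passed
    replace next`i' = mdy(3, 1, year(date) + 1) if passed & missing(next`i')
    drop passed
}

* rowmin() ignores missing values, so absent members do not matter
egen next_bday = rowmin(next1 next2 next3)

* List every member whose next birthday equals the household minimum
gen next_bday_persons = ""
forvalues i = 1/3 {
    replace next_bday_persons = next_bday_persons + "|`i'" if next`i' == next_bday
}
replace next_bday_persons = substr(next_bday_persons, 2, .)

format date birth* next1 next2 next3 next_bday %td
list date next_bday next_bday_persons
```

For the first household this gives 30mar2021 with persons 2|3, and for the second 01mar2021 with person 1. The reshape version scales more naturally when the number of members varies, while this one keeps the data in its original shape.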