Variable showing the highest value of another variable recorded so far, over time
I have a dataset of patients and their alcohol-related data over time (in years), as shown below:
clear
input long patid float(year cohort)
1051 1994 1
2051 1972 1
2051 1989 2
2051 1990 2
2051 2000 2
2051 2001 3
2051 2002 1
2051 2003 2
8051 1995 1
8051 1996 1
8051 2003 1
end
label values cohort cohortlab
label define cohortlab 0 "general population" 1 "no alcohol data" 2 "indeterminate" 3 "non-drinker" 4 "low_risk" 5 "hazardous" 6 "AUD" , replace
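I want to create a variable that shows the highest alcohol-level code recorded at any point (year) in the patient's record so far, so that the dataset would look like this: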
clear
input long patid float(year cohort highestsofar)
1051 1994 1 1
2051 1972 1 1
2051 1989 2 2
2051 1990 2 2
2051 2000 2 2
2051 2001 3 3
2051 2002 1 3
2051 2003 2 3
8051 1995 1 1
8051 1996 1 1
8051 2003 1 1
end
label values cohort cohortlab
label values highestsofar cohortlab
label define cohortlab 0 "general population" 1 "no alcohol data" 2 "indeterminate" 3 "lifetime_abstainer" 4 "low_risk" 5 "hazardous" 6 "AUD" , replace
Here is one answer I came up with:
by patid: g highestsofar=cohort if cohort>cohort[_n-1]|_n==1
by patid: replace highestsofar=highestsofar[_n-1] if cohort<=cohort[_n-1]&_n>1
by patid: replace highestsofar=highestsofar[_n-1] if (highestsofar<highestsofar[_n-1]) & ((cohort>cohort[_n-1])&_n>1)
label values highestsofar cohortlab
I would welcome a discussion of more compact syntax.
Thanks.
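Thanks for the clear example and question.
A FAQ on the StataCorp website addresses this problem. Here is a one-line solution using rangestat from SSC (install it once with ssc install rangestat). In the code below, interval(year . 0) selects, for each observation, all observations of the same patient whose year is at or before the current one, and (max) takes the maximum of cohort over them.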
clear
input long patid float(year cohort)
1051 1994 1
2051 1972 1
2051 1989 2
2051 1990 2
2051 2000 2
2051 2001 3
2051 2002 1
2051 2003 2
8051 1995 1
8051 1996 1
8051 2003 1
end
label values cohort cohortlab
label define cohortlab 0 "general population" 1 "no alcohol data" 2 "indeterminate" 3 "non-drinker" 4 "low_risk" 5 "hazardous" 6 "AUD" , replace
rangestat (max) highestsofar = cohort, interval(year . 0) by(patid)
list, sepby(patid)
     +-------------------------------------------+
     | patid   year            cohort   highes~r |
     |-------------------------------------------|
  1. |  1051   1994   no alcohol data          1 |
     |-------------------------------------------|
  2. |  2051   1972   no alcohol data          1 |
  3. |  2051   1989     indeterminate          2 |
  4. |  2051   1990     indeterminate          2 |
  5. |  2051   2000     indeterminate          2 |
  6. |  2051   2001       non-drinker          3 |
  7. |  2051   2002   no alcohol data          3 |
  8. |  2051   2003     indeterminate          3 |
     |-------------------------------------------|
  9. |  8051   1995   no alcohol data          1 |
 10. |  8051   1996   no alcohol data          1 |
 11. |  8051   2003   no alcohol data          1 |
     +-------------------------------------------+
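For readers who would rather not install a community-contributed command, the same running maximum can be sketched with built-in commands only. This is a minimal sketch, not necessarily the FAQ's exact code. It assumes the data are as in the example above, that cohort is never missing, and that each patient has at most one observation per year. The variable name highestsofar follows the question.

```stata
* Running maximum of cohort within each patient, in year order.
* Assumption: cohort has no missing values and years are unique within patid.
bysort patid (year): gen highestsofar = cohort if _n == 1

* replace works through observations in order, so highestsofar[_n-1]
* already holds the maximum up to the previous year.
by patid: replace highestsofar = max(highestsofar[_n-1], cohort) if _n > 1

label values highestsofar cohortlab
list, sepby(patid)
```

Unlike the three-step attempt in the question, this does not depend on how each value compares with its immediate predecessor. For a patient whose cohort values run 3, 1, 2, 1, for instance, the question's code appears to leave 2 in the last observation, whereas the running maximum stays at 3.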