Nesting for Year with BRFSS Data in svydesign
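I'm working on combining multiple years of national BRFSS data into one set and applying the appropriate complex survey design with the survey package so that I can calculate uncertainty. I've seen several examples of how to do this for a single year, and I know that I need to nest year when pooling multiple years, but I'm not quite sure how to include it. Here is my data after reweighting, using the variable finalwt: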
> glimpse(df)
Rows: 1,756,594
Columns: 15
$ year <dbl> 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016~
$ age <dbl> 43, 59, 80, 70, 18, 65, 74, 76, 43, 56, 75, 61, 57, 58, 70, 62, 65, 52, 37, 80, 36, 34, 80, 66, 72, ~
$ sex <fct> Male, Female, Female, Male, Male, Female, Female, Female, Female, Male, Female, Female, Male, Female~
$ race <fct> White, White, White, White, White, White, White, White, White, White, White, White, White, Other, Wh~
$ insured <fct> Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, ~
$ met_mam <dbl> NA, 1, NA, NA, NA, 1, 1, NA, NA, NA, NA, 0, NA, 1, 0, 1, 1, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, N~
$ met_pap <dbl> NA, 1, NA, NA, NA, 0, NA, NA, 1, NA, NA, 0, NA, 1, NA, 1, 1, 1, 0, NA, NA, 1, NA, NA, NA, NA, 0, 1, ~
$ met_crc <dbl> NA, 1, NA, 1, NA, 1, 1, NA, NA, 1, 1, 0, 1, 1, 1, 1, 1, 0, NA, NA, NA, NA, NA, 1, 1, NA, 1, NA, 1, 1~
$ met_lcs <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
$ psu <dbl> 2.016e+09, 2.016e+09, 2.016e+09, 2.016e+09, 2.016e+09, 2.016e+09, 2.016e+09, 2.016e+09, 2.016e+09, 2~
$ ststr <dbl> 11011, 11011, 11011, 11011, 11011, 11011, 11011, 11011, 11011, 11011, 11011, 11011, 11011, 11011, 11~
$ llcpwt <dbl> 767.8446, 329.6599, 290.7493, 211.0392, 1582.5398, 540.0474, 323.2903, 522.1566, 973.1335, 205.5001,~
$ finalwt1618 <dbl> 395.39989, 169.75764, 149.72072, 108.67418, 814.92545, 278.09623, 166.47765, 268.88341, 501.11298, 1~
$ finalwt1719 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
$ finalwt <dbl> 203.69399, 87.45225, 77.13005, 55.98453, 419.81655, 143.26390, 85.76254, 138.51783, 258.15309, 54.51~
Here is my svydesign() code:
#build in complex survey design
options(survey.lonely.psu = "adjust")
des <- svydesign(ids= ~1, strata= ~ststr, weights= ~finalwt, nest=T, data = df)
Please advise. Thank you.
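You want to treat the years as additional strata. They are strata because the number of PSUs sampled in each year is fixed in advance. Crossing ststr with year via interaction() gives every stratum-by-year combination its own stratum, and psu (rather than ~1) goes in as the cluster id. So, as @Anthony said in the comments: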
brfss_design <- svydesign( id = ~ psu , strata = ~ interaction( ststr , year ) , data = df , weight = ~ finalwt , nest = TRUE )
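where finalwt needs to be the original weight divided by the number of years, so that the weights add up to one copy of the US population (rather than one copy per year pooled).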
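To make the weight rescaling concrete, here is a minimal sketch on made-up data. The toy data frame, its values, and the names toy, n_years, strata_year and toy_design are assumptions for illustration only; the column names follow the question. It divides the original weight by the number of pooled years, as described above, and then builds the design with year crossed into the strata. The survey calls only run if the package is installed.

```r
# Toy data (made-up values): 2 years x 2 strata x 2 PSUs x 2 respondents.
toy <- data.frame(
  year    = rep(c(2016, 2017), each = 8),
  ststr   = rep(rep(c(11011, 11012), each = 4), times = 2),
  psu     = rep(rep(1:2, each = 2), times = 4),
  llcpwt  = c(100, 120,  90, 110, 200, 180, 210, 190,
              105, 115,  95, 115, 195, 185, 205, 195),
  met_crc = c(1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1)
)

# Divide the original weight by the number of pooled years so the
# weights sum to one population rather than one population per year.
n_years <- length(unique(toy$year))
toy$finalwt <- toy$llcpwt / n_years

# Cross stratum with year so each stratum-year is its own stratum.
toy$strata_year <- interaction(toy$ststr, toy$year, drop = TRUE)

# Build the design only if the survey package is available.
if (requireNamespace("survey", quietly = TRUE)) {
  options(survey.lonely.psu = "adjust")
  toy_design <- survey::svydesign(
    ids     = ~psu,
    strata  = ~strata_year,
    weights = ~finalwt,
    nest    = TRUE,   # PSU ids are reused across strata, so nest them
    data    = toy
  )
  print(survey::svymean(~met_crc, toy_design, na.rm = TRUE))
}
```

In this toy example the yearly weight totals are 1200 and 1210, so the rescaled weights sum to their average, 1205. If your pooled years contribute unequally (for example, a module asked only in some years), a plain division by the number of years may not be the right adjustment for that variable.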