使用 ageadjust.direct 计算每个子组的粗率和调整后的比率

Question

我正在尝试计算每年和每个年龄组的疾病发病率。我也想应用直接标准化。我正在使用函数 ageadjust.direct（包 epitools）。

   age_cat persondays_individual contactfirst_cat ESPpop personyears year 
 1 <40                     38624                3  26938      106.   2011 
 2 >90                       367                2    691      1.01 2011 
 3 41-50                  111208               10  14214      305.   2011 
 4 51-60                  222777               29  13534      610.   2011 
 5 61-70                  219567               41  11403      602.   2011 
 6 71-80                  102593               26  77484      281.   2011 
 7 81-90                   20056               10   3945       54.9  2011 
 8 <40                     32673                6  26938       89.5  2012 
 9 >90                       366                0    691      1.00 2012 
10 41-50                  102182               11  14214      280.   2012 
11 51-60                  209241               29  13534      573.   2012 
12 61-70                  224701               33  11403      616.   2012 
13 71-80                  104898               26  77484      287.   2012 
14 81-90                   23771                9   3945       65.1  2012 

df<-structure(list(age_cat = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 
7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L), .Label = c("<40", ">90", "41-50", 
"51-60", "61-70", "71-80", "81-90"), class = "factor"), persondays_individual = c(38624, 
367, 111208, 222777, 219567, 102593, 20056, 32673, 366, 102182, 
209241, 224701, 104898, 23771), contactfirst_cat = c(3, 2, 10, 
29, 41, 26, 10, 6, 0, 11, 29, 33, 26, 9), ESPpop = c(26938, 691, 
14214, 13534, 11403, 77484, 3945, 26938, 691, 14214, 13534, 11403, 
77484, 3945), personyears = c(105.819178082192, 1.00547945205479, 
304.679452054795, 610.34794520548, 601.553424657534, 281.076712328767, 
54.9479452054794, 89.5150684931507, 1.0027397260274, 279.950684931507, 
573.26301369863, 615.619178082192, 287.391780821918, 65.1260273972603
), year = c("2011", "2011", "2011", "2011", "2011", "2011", "2011", 
"2012", "2012", "2012", "2012", "2012", "2012", "2012")), row.names = c(NA, 
14L), class = "data.frame")

我想计算粗发病率（$contactfirst_cat / 人年）、调整后的发病率以及每年和年龄组的 95% CI。

期望输出

   age_cat persondays_individual contactfirst_cat ESPpop personyears year    crude.rate  adj. rate   lci  uci
 1 <40                     38624                3  26938      106.   2011 
 2 >90                       367                2    691      1.01   2011 
 3 41-50                  111208               10  14214      305.   2011 
 4 51-60                  222777               29  13534      610.   2011 
 5 61-70                  219567               41  11403      602.   2011 
 6 71-80                  102593               26  77484      281.   2011 
 7 81-90                   20056               10   3945       54.9  2011 
 8 <40                     32673                6  26938       89.5  2012 
 9 >90                       366                0    691      1.00   2012 
10 41-50                  102182               11  14214      280.   2012

我尝试了以下代码（我从 Calculating crude and age-standardized rates, rate differences, and rate ratios with CIs by subgroups 获得）

df%>%
  group_by(year, age_cat) %>%
  summarise(age_adjust = list(ageadjust.direct(count = contactfirst_cat,   #count of events
                                               pop = personyears,          #person years of DFpop
                                               rate = NULL,                
                                               stdpop = ESPpop,            #standard population (European standard population per age_cat)
                                               conf.level = 0.95))) %>%
  mutate(age_adjust = map(age_adjust, as.data.frame.list)) %>%
  ungroup()

然而，这段代码没有给我每个子组（年和 age_cat）的 crude/adj 率和 CI，但我只得到这 4 列的 1 个观察值。

如何为 crude.rate、adj.rate、lci 和 uci PER year 以及 age_cat 创建新列？任何帮助将不胜感激！非常感谢！

编辑当我运行 Quinten 的答案中的脚本时，我得到以下结果：

    crude.rate    adj.rate     lci          uci
1   0.06070314     0.07801597  0.06298263   0.09764104

Quinten 显示为 df_2 的输出正是我想要的。我不确定我做错了什么。

Answer 1

首先你需要unnest你的代码而不是像这样ungroup：

library(tidyverse)
library(epitools)
df_2 <- df %>%
  group_by(year, age_cat) %>%
  summarise(age_adjust = list(ageadjust.direct(count = contactfirst_cat,   #count of events
                                               pop = personyears,          #person years of DFpop
                                               rate = NULL,                
                                               stdpop = ESPpop,            #standard population (European standard population per age_cat)
                                               conf.level = 0.95))) %>%
  mutate(age_adjust = map(age_adjust, as.data.frame.list)) %>%
  unnest(cols = c(age_adjust))

输出df_2:

# A tibble: 14 × 6
# Groups:   year [2]
   year  age_cat crude.rate adj.rate       lci    uci
   <chr> <fct>        <dbl>    <dbl>     <dbl>  <dbl>
 1 2011  <40         0.0284   0.0284   0.00585 0.0829
 2 2011  >90         1.99     1.99     0.241   7.19  
 3 2011  41-50       0.0328   0.0328   0.0157  0.0604
 4 2011  51-60       0.0475   0.0475   0.0318  0.0682
 5 2011  61-70       0.0682   0.0682   0.0489  0.0925
 6 2011  71-80       0.0925   0.0925   0.0604  0.136 
 7 2011  81-90       0.182    0.182    0.0873  0.335 
 8 2012  <40         0.0670   0.0670   0.0246  0.146 
 9 2012  >90         0        0      NaN       3.68  
10 2012  41-50       0.0393   0.0393   0.0196  0.0703
11 2012  51-60       0.0506   0.0506   0.0339  0.0727
12 2012  61-70       0.0536   0.0536   0.0369  0.0753
13 2012  71-80       0.0905   0.0905   0.0591  0.133 
14 2012  81-90       0.138    0.138    0.0632  0.262

在你可以像这样绑定你的两个数据帧以获得你想要的数据帧之后：

df_desired <- cbind(df, df_2[,c(3:6)]) 
df_desired

输出：

   age_cat persondays_individual contactfirst_cat ESPpop personyears year crude.rate   adj.rate         lci        uci
1      <40                 38624                3  26938  105.819178 2011 0.02835025 0.02835025 0.005846503 0.08285146
2      >90                   367                2    691    1.005479 2011 1.98910082 1.98910082 0.240889337 7.18531607
3    41-50                111208               10  14214  304.679452 2011 0.03282138 0.03282138 0.015739127 0.06035969
4    51-60                222777               29  13534  610.347945 2011 0.04751388 0.04751388 0.031820792 0.06823786
5    61-70                219567               41  11403  601.553425 2011 0.06815687 0.06815687 0.048910548 0.09246249
6    71-80                102593               26  77484  281.076712 2011 0.09250144 0.09250144 0.060425010 0.13553604
7    81-90                 20056               10   3945   54.947945 2011 0.18199043 0.18199043 0.087271484 0.33468687
8      <40                 32673                6  26938   89.515068 2012 0.06702782 0.06702782 0.024598029 0.14589135
9      >90                   366                0    691    1.002740 2012 0.00000000 0.00000000         NaN 3.67880055
10   41-50                102182               11  14214  279.950685 2012 0.03929263 0.03929263 0.019614742 0.07030538
11   51-60                209241               29  13534  573.263014 2012 0.05058760 0.05058760 0.033879310 0.07265223
12   61-70                224701               33  11403  615.619178 2012 0.05360457 0.05360457 0.036898918 0.07528074
13   71-80                104898               26  77484  287.391781 2012 0.09046884 0.09046884 0.059097248 0.13255781
14   81-90                 23771                9   3945   65.126027 2012 0.13819360 0.13819360 0.063190912 0.26233449

使用 ageadjust.direct 计算每个子组的粗率和调整后的比率

Calculate crude and ajusted rates per subgroup using ageadjust.direct

r

rate

adjustment

standardization