Spline model with calendar year

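I want to know whether the mortality rate (variable "mortality_rate") changes over time (variable "Year"). Because the relationship between year and mortality_rate is not linear (see figure), I would like to fit a spline model with year as the independent variable and mortality_rate as the dependent variable. How can I fit a spline model with 20 knots for year?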

I have the following data in R:

dat <- structure(list(Year = c(1998, 1999, 2000, 2001, 2002, 2003, 2004, 
2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 
2016, 2017, 2018), mortality_rate = c(0.0088, 0.0077, 0.0082, 
0.0075, 0.0076, 0.0075, 0.0066, 0.0061, 0.0059, 0.0054, 0.0054, 
0.0058, 0.0056, 0.006, 0.0053, 0.0061, 0.0052, 0.0055, 0.0069, 
0.0074, 0.0073)), row.names = c(NA, 21L), class = "data.frame")

A quadratic polynomial fits the data well visually (see the plot at the end), and all of its coefficients are highly significant:

fm <- lm(mortality_rate ~ poly(Year, 2), dat)
plot(dat)
lines(fitted(fm) ~ Year, dat, col = "red")
summary(fm)

giving:

Call:
lm(formula = mortality_rate ~ poly(Year, 2), data = dat)

Residuals:
       Min         1Q     Median         3Q        Max 
-8.066e-04 -2.774e-04  1.149e-05  2.689e-04  7.702e-04 

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)    
(Intercept)     0.0065619  0.0001013  64.793  < 2e-16 ***
poly(Year, 2)1 -0.0024938  0.0004641  -5.373 4.17e-05 ***
poly(Year, 2)2  0.0036130  0.0004641   7.785 3.61e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.0004641 on 18 degrees of freedom
Multiple R-squared:  0.8325,    Adjusted R-squared:  0.8139 
F-statistic: 44.74 on 2 and 18 DF,  p-value: 1.037e-07
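
If a spline is still wanted, the following is a minimal sketch of how one could be fitted and compared with the quadratic model above. It assumes a natural cubic spline from the `splines` package (which ships with R); the choice of `df = 4` and the model names `fm_quad` and `fm_ns` are illustrative assumptions, not part of the original analysis. Note that with only 21 observations, a spline with 20 knots would have about as many parameters as data points and would essentially interpolate the data, so a much smaller basis is probably more sensible.

```r
library(splines)  # part of the standard R distribution

# Same data as in the question, written compactly
dat <- data.frame(
  Year = 1998:2018,
  mortality_rate = c(0.0088, 0.0077, 0.0082, 0.0075, 0.0076, 0.0075, 0.0066,
                     0.0061, 0.0059, 0.0054, 0.0054, 0.0058, 0.0056, 0.006,
                     0.0053, 0.0061, 0.0052, 0.0055, 0.0069, 0.0074, 0.0073)
)

# Quadratic fit from above, refitted here for comparison
fm_quad <- lm(mortality_rate ~ poly(Year, 2), data = dat)

# Natural cubic spline; df = 4 is an illustrative choice that places
# 3 interior knots at quantiles of Year (an assumption, not a recommendation)
fm_ns <- lm(mortality_rate ~ ns(Year, df = 4), data = dat)

# The two models are not nested, so compare them with AIC rather than anova();
# a lower AIC suggests a better trade-off between fit and complexity
print(AIC(fm_quad, fm_ns))

# Overlay both fitted curves on the data
plot(dat)
lines(fitted(fm_quad) ~ Year, dat, col = "red")
lines(fitted(fm_ns) ~ Year, dat, col = "blue", lty = 2)
```

If the spline does not clearly improve on the quadratic by AIC or by eye, the simpler quadratic model is likely the easier one to report.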