nlme::gls() R code on the Applied Longitudinal Analysis, 2nd Edition website needs some tweaks

My question is simple. When I run the code below, copied directly from the "Applied Longitudinal Analysis" website here: https://content.sph.harvard.edu/fitzmaur/ala2e/ (Chapter 5, Section 5.7),

library(foreign)
ds <- read.dta("tlc.dta")
ds$baseline <- ds$y0
tlclong <- reshape(ds, idvar="id", varying=c("y0","y1","y4","y6"),v.names="y", timevar="time", time=1:4, direction="long")
tlclong <- subset(tlclong, time > 1)
attach(tlclong)

week <- time
week[time==2] <- 1
week[time==3] <- 4
week[time==4] <- 6
time <- time - 1
week.f <- factor(week, c(1,4,6))
change <- y - baseline
cbaseline <- baseline - 26.406

library(nlme)
model <- gls(y ~ I(week.f==1) + I(week.f==4) + I(week.f==6) + I(week.f==1 & trt=="Succimer") + I(week.f==4 & trt=="Succimer") + I(week.f==6 & trt=="Succimer"), corr=corSymm(, form= ~ time | id), weights = varIdent(form = ~ 1 | week.f))

summary(model)

the output I get is completely different in several critical places.

Here is the output I get:

Generalized least squares fit by REML
  Model: y ~ I(week.f == 1) + I(week.f == 4) + I(week.f == 6) + I(week.f ==      1 & trt == "Succimer") + I(week.f == 4 & trt == "Succimer") +      I(week.f == 6 & trt == "Succimer") 
  Data: NULL 
       AIC      BIC    logLik
  2028.922 2076.764 -1001.461

Correlation Structure: General
 Formula: ~time | id 
 Parameter estimate(s):
 Correlation: 
  1 2
2 0  
3 0 0
Variance function:
 Structure: Different standard deviations per stratum
 Formula: ~1 | week.f 
 Parameter estimates:
1 4 6 
1 1 1 

Coefficients:
                                         Value Std.Error   t-value p-value
(Intercept)                             23.646  1.002955 23.576329  0.0000
I(week.f == 1)TRUE                       1.014  1.418393  0.714894  0.4752
I(week.f == 4)TRUE                       0.424  1.418393  0.298930  0.7652
I(week.f == 1 & trt == "Succimer")TRUE -11.138  1.418393 -7.852550  0.0000
I(week.f == 4 & trt == "Succimer")TRUE  -8.556  1.418393 -6.032180  0.0000
I(week.f == 6 & trt == "Succimer")TRUE  -2.884  1.418393 -2.033287  0.0429

 Correlation: 
                                       (Intr) I(.==1 I(.==4 I=1&t=" I=4&t="
I(week.f == 1)TRUE                     -0.707                              
I(week.f == 4)TRUE                     -0.707  0.500                       
I(week.f == 1 & trt == "Succimer")TRUE  0.000 -0.500  0.000                
I(week.f == 4 & trt == "Succimer")TRUE  0.000  0.000 -0.500  0.000         
I(week.f == 6 & trt == "Succimer")TRUE -0.707  0.500  0.500  0.000   0.000 

Standardized residuals:
       Min         Q1        Med         Q3        Max 
-2.3494198 -0.6575048 -0.1467858  0.5279214  6.0826597 

Residual standard error: 7.091964 
Degrees of freedom: 300 total; 293 residual

And here is the output shown on the website linked above:

Generalized least squares fit by REML
Model: y ~ I(week.f == 1) + I(week.f == 4) + I(week.f == 6) + I(week.f == 1 & trt == "Succimer") + I(week.f == 4 & trt == "Succimer") + I(week.f == 6 & trt == "Succimer") 

  Data: NULL 
       AIC      BIC    logLik
  2451.990 2519.544 -1208.995

Correlation Structure: General
 Formula: ~time | id 
 Parameter estimate(s):
 Correlation: 
  1     2     3    
2 0.569            
3 0.568 0.775      
4 0.575 0.581 0.580

Variance function:
 Structure: Different standard deviations per stratum
 Formula: ~1 | week.f 
 Parameter estimates:
       0        1        4        6 
1.000000 1.330103 1.374827 1.529615 

Coefficients:
                                            Value Std.Error   t-value p-value
(Intercept)                             26.406000 0.4998908  52.82354  0.0000
I(week.f == 1)TRUE                      -1.644501 0.7824044  -2.10186  0.0362
I(week.f == 4)TRUE                      -2.231356 0.8073811  -2.76370  0.0060
I(week.f == 6)TRUE                      -2.642065 0.8864616  -2.98046  0.0031
I(week.f == 1 & trt == "Succimer")TRUE -11.340998 1.0931205 -10.37488  0.0000
I(week.f == 4 & trt == "Succimer")TRUE  -8.765288 1.1312570  -7.74827  0.0000
I(week.f == 6 & trt == "Succimer")TRUE  -3.119869 1.2507776  -2.49434  0.0130

 Correlation: 
                                       (Intr) I(.==1 I(.==4 I(.==6 I=1&t="
I(week.f == 1)TRUE                     -0.155                             
I(week.f == 4)TRUE                     -0.136  0.674                      
I(week.f == 6)TRUE                     -0.068  0.381  0.380               
I(week.f == 1 & trt == "Succimer")TRUE  0.000 -0.699 -0.467 -0.265        
I(week.f == 4 & trt == "Succimer")TRUE  0.000 -0.466 -0.701 -0.265  0.667 
I(week.f == 6 & trt == "Succimer")TRUE  0.000 -0.263 -0.263 -0.705  0.376 

                                       I=4&t="

I(week.f == 1)TRUE                            
I(week.f == 4)TRUE                            
I(week.f == 6)TRUE                            
I(week.f == 1 & trt == "Succimer")TRUE        
I(week.f == 4 & trt == "Succimer")TRUE        
I(week.f == 6 & trt == "Succimer")TRUE  0.375 

Standardized residuals:
       Min         Q1        Med         Q3        Max 
-2.1636401 -0.7011814 -0.1426534  0.5374840  5.6570302 
 
Residual standard error: 4.998908 
Degrees of freedom: 400 total; 393 residual

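Note that:

  1. The AIC, BIC and logLik values are all different.
  2. The correlation structure output is completely different. Mine is all zeros, and an entire dimension of the correlation matrix is missing.
  3. My variance function parameter estimates are completely different; the stratum 0 shown on the website is also missing from mine.
  4. The estimated coefficients differ substantially in most cases; my output is also missing the week.f == 6 row entirely.
  5. The correlation matrix printed below the coefficients is completely different.
  6. My output reports 300 total degrees of freedom; Fitzmaurice's reports 400 (the rest of the final part of the output is also clearly different).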

In addition, when I run the code as is, I get an error message:

Error in glsEstimate(object, control = control) : 
  computed "gls" fit is singular, rank 7

So, to get any output at all, I had to add this argument to the gls() call:

control = list(singular.ok = TRUE)

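So my question is: why is the output so different when I run the code myself? Evidently whoever built the website did not copy the code over correctly.

If you want to try running the code yourself, the dataset tlc.dta can be downloaded from the website linked above.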

After changing Fitzmaurice's code slightly so that it keeps time == 1 (and therefore week == 0), I get the same output as shown on the website:

ds <- foreign::read.dta("tlc.dta")
ds$baseline <- ds$y0
tlclong <- reshape(ds, idvar="id", varying=c("y0","y1","y4","y6"),
                   v.names="y", timevar="time", time=1:4, direction="long")
# tlclong <- subset(tlclong, time > 1)
attach(tlclong)

week <- time
week[time==1] <- 0
week[time==2] <- 1
week[time==3] <- 4
week[time==4] <- 6
time <- time - 1
week.f <- factor(week, c(0,1,4,6))
change <- y - baseline
cbaseline <- baseline - 26.406

library(nlme)
model <- gls(y ~ week.f +
                 I(week.f==1 & trt=="Succimer") +
                 I(week.f==4 & trt=="Succimer") +
                 I(week.f==6 & trt=="Succimer"),
             corr=corSymm(form= ~ time | id),
             weights = varIdent(form = ~ 1 | week.f)
             )
summary(model)
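The 300 versus 400 in the degrees-of-freedom line is consistent with the baseline rows being dropped by the subset() call. Here is a minimal sketch with made-up data; the data frame `wide` and its random values are hypothetical stand-ins for tlc.dta, and I am assuming 100 subjects with 4 measurements each, which matches the 400 total on the website:

```r
# Hypothetical wide data standing in for tlc.dta (values are random, not real)
set.seed(1)
wide <- data.frame(id = 1:100,
                   y0 = rnorm(100), y1 = rnorm(100),
                   y4 = rnorm(100), y6 = rnorm(100))

# Same reshape call as in the question, with the argument spelled out as `times`
long <- reshape(wide, idvar = "id", varying = c("y0", "y1", "y4", "y6"),
                v.names = "y", timevar = "time", times = 1:4,
                direction = "long")

n_all     <- nrow(long)                     # baseline (time == 1) kept
n_dropped <- nrow(subset(long, time > 1))   # baseline removed
c(with_baseline = n_all, without_baseline = n_dropped)
```

With these assumptions the two counts are 400 and 300, the same totals that appear in the two outputs.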

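So, in summary, I should change the wording of my question from "the output from nlme is different now" to "Fitzmaurice forgot to post the code that must originally have produced the output shown on his website". My bad.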
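As for the singular-fit error in the question: once week 0 is removed, every remaining row has exactly one of the three week indicators equal to TRUE, so those three columns add up to the intercept column and the design matrix is rank deficient. A small sketch on a toy design shows this; the `toy` data frame is made up for illustration, and only the formula comes from the question:

```r
# Hypothetical toy design: 4 subjects, weeks 1, 4 and 6 only (no week 0)
toy <- expand.grid(week = c(1, 4, 6), id = 1:4)
toy$trt <- ifelse(toy$id <= 2, "Placebo", "Succimer")
toy$week.f <- factor(toy$week, c(1, 4, 6))

# Fixed-effects design matrix for the formula used in the question
X <- model.matrix(~ I(week.f == 1) + I(week.f == 4) + I(week.f == 6) +
                    I(week.f == 1 & trt == "Succimer") +
                    I(week.f == 4 & trt == "Succimer") +
                    I(week.f == 6 & trt == "Succimer"),
                  data = toy)

n_cols <- ncol(X)        # intercept plus six indicator columns
x_rank <- qr(X)$rank     # numerical rank of the design matrix

# The three week indicators sum to the intercept column in every row
weeks_sum_to_intercept <- all(X[, 2] + X[, 3] + X[, 4] == X[, 1])
c(columns = n_cols, rank = x_rank)
```

In this toy design there are 7 columns but the rank is only 6, which would explain why gls() refuses to fit without singular.ok = TRUE and why one week indicator is missing from the coefficient table. Keeping week 0 in the data gives the intercept its own reference group, and the collinearity goes away.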