使用 `plm()` 估计具有嵌套结构的重复测量随机效应模型
estimate a repeated measures random effects model with a nested structure using `plm()`
是否可以使用 plm 包中的 plm()
估计具有嵌套结构 的 重复测量随机效应模型?
我知道 lme4 包中的 lmer()
是可行的。但是,lmer()
依赖似然框架,我很想用 plm()
来做。
这是我的最小工作示例,灵感来自 this question。首先是一些需要的包和数据,
# install.packages(c("plm", "lme4", "texreg", "mlmRev"), dependencies = TRUE)
data(egsingle, package = "mlmRev")
数据集 egsingle
是一个不平衡的面板,由 1721 名学童组成,分为 60 所学校,跨越五个时间点。有关详细信息,请参阅 ?mlmRev::egsingle
一些轻数据管理
dta <- egsingle
dta$Female <- with(dta, ifelse(female == 'Female', 1, 0))
此外,相关数据的片段
dta[118:127,c('schoolid','childid','math','year','size','Female')]
#> schoolid childid math year size Female
#> 118 2040 289970511 -1.830 -1.5 502 1
#> 119 2040 289970511 -1.185 -0.5 502 1
#> 120 2040 289970511 0.852 0.5 502 1
#> 121 2040 289970511 0.573 1.5 502 1
#> 122 2040 289970511 1.736 2.5 502 1
#> 123 2040 292772811 -3.144 -1.5 502 0
#> 124 2040 292772811 -2.097 -0.5 502 0
#> 125 2040 292772811 -0.316 0.5 502 0
#> 126 2040 293550291 -2.097 -1.5 502 0
#> 127 2040 293550291 -1.314 -0.5 502 0
现在,严重依赖 Robert Long 的回答,这就是我使用嵌套结构估计重复测量随机效应模型的方式lmer()
来自 lme4 包,
dta$year <- as.factor(dta$year)
require(lme4)
Model.1 <- lmer(math ~ Female + size + year + (1 | schoolid /childid), dta)
# summary(Model.1)
我在 man 页面中查找 plm()
,它有一个索引命令 index
,但它只需要一个索引和 time,即 index = c("childid", "year")
,忽略 schoolid
模型看起来像这样,
dta$year <- as.numeric(dta$year)
library(plm)
Model.2 <- plm(math~Female+size+year, dta, index = c("childid", "year"), model="random")
# summary(Model.2)
总结一下问题
How can I, or is it even possible, to specify a repeated measures random effects model with a nested structure, like Model.1
, using plm()
from the plm package?
下面是两个模型的实际估计结果,
# require(texreg)
names(Model.2$coefficients) <- names(coefficients(Model.1)$schoolid) #ugly!
texreg::screenreg(list(Model.1, Model.2), digits = 3) # pretty!
#> ==============================================================
#> Model 1 Model 2
#> --------------------------------------------------------------
#> (Intercept) -2.693 *** -2.671 ***
#> (0.152) (0.085)
#> Female 0.008 -0.025
#> (0.042) (0.046)
#> size -0.000 -0.000 ***
#> (0.000) (0.000)
#> year-1.5 0.866 *** 0.878 ***
#> (0.059) (0.059)
#> year-0.5 1.870 *** 1.882 ***
#> (0.058) (0.059)
#> year0.5 2.562 *** 2.575 ***
#> (0.059) (0.059)
#> year1.5 3.133 *** 3.149 ***
#> (0.059) (0.060)
#> year2.5 3.939 *** 3.956 ***
#> (0.060) (0.060)
#> --------------------------------------------------------------
#> AIC 16590.715
#> BIC 16666.461
#> Log Likelihood -8284.357
#> Num. obs. 7230 7230
#> Num. groups: childid:schoolid 1721
#> Num. groups: schoolid 60
#> Var: childid:schoolid (Intercept) 0.672
#> Var: schoolid (Intercept) 0.180
#> Var: Residual 0.334
#> R^2 0.004
#> Adj. R^2 0.003
#> ==============================================================
#> *** p < 0.001, ** p < 0.01, * p < 0.05
基于 I wrote the following model specification for a repeated measures random effects model with a nested structure, in plm()
from the plm package using Wallace and Hussain's (1969)方法,即random.method = "walhus"
,用于估计方差分量,
p_dta <- pdata.frame(dta, index = c("childid", "year", "schoolid"))
Model.3 <- plm(math ~ Female + size + year, data = p_dta, model = "random",
effect = "nested", random.method = "walhus")
如我所料,Model.3
中的结果与 Model.1
中的估计值几乎相同。只有拦截略有不同(见下面的输出)。
I wrote the above based on the example from Baltagi, Song and Jung (2001) provided in ?plm
. In the Baltagi, Song and Jung (2001)-example the variance components are estimated first using Swamy and Arora (1972), i.e. random.method = "swar"
, and second with using Wallace and Hussain's (1969). Only the Nerlove (1971) transformation does not converge using the Song and Jung (2001)-data. Whereas it was only Wallace and Hussain's (1969)-method that could converge using the egsingle
data-set.
Any authoritative references on this would be appreciated. I'll keep working at it.
names(Model.3$coefficients) <- names(coefficients(Model.1)$schoolid)
texreg::screenreg(list(Model.1, Model.3), digits = 3,
custom.model.names = c('Model 1', 'Model 3'))
#> ==============================================================
#> Model 1 Model 3
#> --------------------------------------------------------------
#> (Intercept) -2.693 *** -2.697 ***
#> (0.152) (0.152)
#> Female 0.008 0.008
#> (0.042) (0.042)
#> size -0.000 -0.000
#> (0.000) (0.000)
#> year-1.5 0.866 *** 0.866 ***
#> (0.059) (0.059)
#> year-0.5 1.870 *** 1.870 ***
#> (0.058) (0.058)
#> year0.5 2.562 *** 2.562 ***
#> (0.059) (0.059)
#> year1.5 3.133 *** 3.133 ***
#> (0.059) (0.059)
#> year2.5 3.939 *** 3.939 ***
#> (0.060) (0.060)
#> --------------------------------------------------------------
#> AIC 16590.715
#> BIC 16666.461
#> Log Likelihood -8284.357
#> Num. obs. 7230 7230
#> Num. groups: childid:schoolid 1721
#> Num. groups: schoolid 60
#> Var: childid:schoolid (Intercept) 0.672
#> Var: schoolid (Intercept) 0.180
#> Var: Residual 0.334
#> R^2 0.000
#> Adj. R^2 -0.001
#> ==============================================================
#> *** p < 0.001, ** p < 0.01, * p < 0.05#>
是否可以使用 plm 包中的 plm()
估计具有嵌套结构 的 重复测量随机效应模型?
我知道 lme4 包中的 lmer()
是可行的。但是,lmer()
依赖似然框架,我很想用 plm()
来做。
这是我的最小工作示例,灵感来自 this question。首先是一些需要的包和数据,
# install.packages(c("plm", "lme4", "texreg", "mlmRev"), dependencies = TRUE)
data(egsingle, package = "mlmRev")
数据集 egsingle
是一个不平衡的面板,由 1721 名学童组成,分为 60 所学校,跨越五个时间点。有关详细信息,请参阅 ?mlmRev::egsingle
一些轻数据管理
dta <- egsingle
dta$Female <- with(dta, ifelse(female == 'Female', 1, 0))
此外,相关数据的片段
dta[118:127,c('schoolid','childid','math','year','size','Female')]
#> schoolid childid math year size Female
#> 118 2040 289970511 -1.830 -1.5 502 1
#> 119 2040 289970511 -1.185 -0.5 502 1
#> 120 2040 289970511 0.852 0.5 502 1
#> 121 2040 289970511 0.573 1.5 502 1
#> 122 2040 289970511 1.736 2.5 502 1
#> 123 2040 292772811 -3.144 -1.5 502 0
#> 124 2040 292772811 -2.097 -0.5 502 0
#> 125 2040 292772811 -0.316 0.5 502 0
#> 126 2040 293550291 -2.097 -1.5 502 0
#> 127 2040 293550291 -1.314 -0.5 502 0
现在,严重依赖 Robert Long 的回答,这就是我使用嵌套结构估计重复测量随机效应模型的方式lmer()
来自 lme4 包,
dta$year <- as.factor(dta$year)
require(lme4)
Model.1 <- lmer(math ~ Female + size + year + (1 | schoolid /childid), dta)
# summary(Model.1)
我在 man 页面中查找 plm()
,它有一个索引命令 index
,但它只需要一个索引和 time,即 index = c("childid", "year")
,忽略 schoolid
模型看起来像这样,
dta$year <- as.numeric(dta$year)
library(plm)
Model.2 <- plm(math~Female+size+year, dta, index = c("childid", "year"), model="random")
# summary(Model.2)
总结一下问题
How can I, or is it even possible, to specify a repeated measures random effects model with a nested structure, like
Model.1
, usingplm()
from the plm package?
下面是两个模型的实际估计结果,
# require(texreg)
names(Model.2$coefficients) <- names(coefficients(Model.1)$schoolid) #ugly!
texreg::screenreg(list(Model.1, Model.2), digits = 3) # pretty!
#> ==============================================================
#> Model 1 Model 2
#> --------------------------------------------------------------
#> (Intercept) -2.693 *** -2.671 ***
#> (0.152) (0.085)
#> Female 0.008 -0.025
#> (0.042) (0.046)
#> size -0.000 -0.000 ***
#> (0.000) (0.000)
#> year-1.5 0.866 *** 0.878 ***
#> (0.059) (0.059)
#> year-0.5 1.870 *** 1.882 ***
#> (0.058) (0.059)
#> year0.5 2.562 *** 2.575 ***
#> (0.059) (0.059)
#> year1.5 3.133 *** 3.149 ***
#> (0.059) (0.060)
#> year2.5 3.939 *** 3.956 ***
#> (0.060) (0.060)
#> --------------------------------------------------------------
#> AIC 16590.715
#> BIC 16666.461
#> Log Likelihood -8284.357
#> Num. obs. 7230 7230
#> Num. groups: childid:schoolid 1721
#> Num. groups: schoolid 60
#> Var: childid:schoolid (Intercept) 0.672
#> Var: schoolid (Intercept) 0.180
#> Var: Residual 0.334
#> R^2 0.004
#> Adj. R^2 0.003
#> ==============================================================
#> *** p < 0.001, ** p < 0.01, * p < 0.05
基于plm()
from the plm package using Wallace and Hussain's (1969)方法,即random.method = "walhus"
,用于估计方差分量,
p_dta <- pdata.frame(dta, index = c("childid", "year", "schoolid"))
Model.3 <- plm(math ~ Female + size + year, data = p_dta, model = "random",
effect = "nested", random.method = "walhus")
如我所料,Model.3
中的结果与 Model.1
中的估计值几乎相同。只有拦截略有不同(见下面的输出)。
I wrote the above based on the example from Baltagi, Song and Jung (2001) provided in
?plm
. In the Baltagi, Song and Jung (2001)-example the variance components are estimated first using Swamy and Arora (1972), i.e.random.method = "swar"
, and second with using Wallace and Hussain's (1969). Only the Nerlove (1971) transformation does not converge using the Song and Jung (2001)-data. Whereas it was only Wallace and Hussain's (1969)-method that could converge using theegsingle
data-set.Any authoritative references on this would be appreciated. I'll keep working at it.
names(Model.3$coefficients) <- names(coefficients(Model.1)$schoolid)
texreg::screenreg(list(Model.1, Model.3), digits = 3,
custom.model.names = c('Model 1', 'Model 3'))
#> ==============================================================
#> Model 1 Model 3
#> --------------------------------------------------------------
#> (Intercept) -2.693 *** -2.697 ***
#> (0.152) (0.152)
#> Female 0.008 0.008
#> (0.042) (0.042)
#> size -0.000 -0.000
#> (0.000) (0.000)
#> year-1.5 0.866 *** 0.866 ***
#> (0.059) (0.059)
#> year-0.5 1.870 *** 1.870 ***
#> (0.058) (0.058)
#> year0.5 2.562 *** 2.562 ***
#> (0.059) (0.059)
#> year1.5 3.133 *** 3.133 ***
#> (0.059) (0.059)
#> year2.5 3.939 *** 3.939 ***
#> (0.060) (0.060)
#> --------------------------------------------------------------
#> AIC 16590.715
#> BIC 16666.461
#> Log Likelihood -8284.357
#> Num. obs. 7230 7230
#> Num. groups: childid:schoolid 1721
#> Num. groups: schoolid 60
#> Var: childid:schoolid (Intercept) 0.672
#> Var: schoolid (Intercept) 0.180
#> Var: Residual 0.334
#> R^2 0.000
#> Adj. R^2 -0.001
#> ==============================================================
#> *** p < 0.001, ** p < 0.01, * p < 0.05#>