How to cluster standard errors in clubSandwich's vcovCR()?

Question

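I am trying to specify a cluster variable in vcovCR() from the clubSandwich package after fitting a model with plm on my simulated data (which I use for a power simulation), but I get the following error message: "Error in [.data.frame(eval(mf$data, envir), , index_names) : undefined columns selected"

I am not sure whether this is specific to vcovCR() or something general about R, but can anyone tell me what is wrong with my code? (I saw a related post here, How to cluster standard errors of plm at different level rather than id or time?, but it did not solve my problem.)

My code: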

N <- 100;id <- 1:N;id <- c(id,id);gid <- 1:(N/2);
gid <- c(gid,gid,gid,gid);T <- rep(0,N);T = c(T,T+1)
a <- qnorm(runif(N),mean=0,sd=0.005)
gp <- qnorm(runif(N/2),mean=0,sd=0.0005)
u <- qnorm(runif(N*2),mean=0,sd=0.05)
a <- c(a,a);gp = c(gp,gp,gp,gp)
Ylatent <- -0.05*T + a + u
Data <- data.frame(
  Y = ifelse(Ylatent > 0, 1, 0),
  id = id,gid = gid,T = T
)
library(clubSandwich)
library(plm)
fe.fit <- plm(formula = Y ~ T, data = Data, model = "within", index = "id",effect = "individual", singular.ok = FALSE)
vcovCR(fe.fit,cluster=Data$id,type = "CR2") # doesn't work, but I can run this by not specifying cluster as in the next line
vcovCR(fe.fit,type = "CR2")
vcovCR(fe.fit,cluster=Data$gid,type = "CR2") # I ultimately want to run this

Answer 1

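First, convert your data to a pdata.frame. This is safer, especially if you want the time index to be created automatically (which, judging from your code, seems to be the case).

Continuing your example: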

pData <- pdata.frame(Data, index = "id") # time index is created automatically
fe.fit2 <- plm(formula = Y ~ T, data = pData, model = "within", effect = "individual")
vcovCR(fe.fit2, cluster=Data$id,type = "CR2")
vcovCR(fe.fit2, type = "CR2")
vcovCR(fe.fit2,cluster=Data$gid,type = "CR2")

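Your original example does not run because of a bug in get_index_order, clubSandwich's data extraction function for plm objects (as of version 0.3.3). It assumes that both index variables are present in the original data. That is not the case in your example, where only one dimension is specified via the index argument and the time index is therefore created automatically.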
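To make the point about the missing index variable concrete, here is a minimal sketch on a tiny stand-in data set (3 individuals, 2 periods; the values are made up for illustration and are not the simulated data from the question). It only shows that the raw data frame has no time column while the pdata.frame carries both index variables. The last step, adding an explicit time column to the raw data with ave(), is an assumption about what might also sidestep the error, based on the explanation above; it has not been checked against clubSandwich 0.3.3 here.

```r
library(plm)

# Tiny stand-in for the question's Data: 3 individuals, 2 periods,
# stacked period by period like the original (illustrative values only).
Data <- data.frame(
  id = rep(1:3, times = 2),
  Y  = c(0, 1, 0, 1, 1, 0),
  T  = rep(c(0, 1), each = 3)
)

# The raw data frame has no time column ...
has_time <- "time" %in% names(Data)

# ... but pdata.frame() with a single index generates one automatically.
pData <- pdata.frame(Data, index = "id")
index_names <- names(index(pData))

print(has_time)
print(index_names)

# Assumed alternative: put an explicit period counter into the raw data,
# so that both index variables exist there and can be passed as
# index = c("id", "time") to plm().
Data$time <- ave(seq_along(Data$id), Data$id, FUN = seq_along)
print(Data$time)
```

With the explicit column in place, the model call would use index = c("id", "time") instead of index = "id"; whether vcovCR() then accepts cluster = Data$gid on the affected version should be verified on your own installation.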
