Higher level cluster standard errors for panel data

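I want to estimate clustered SEs for a panel model (first differences) in R, with 100 groups, 6,156 individuals and 15 years. Some of the individuals are repeated (4,201 are unique) because they form a matched sample obtained with a one-to-one matching method with replacement.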

I have been using plm to estimate the model coefficients, after converting my matched sample to a pdata.frame with individual and year as the indices.

I was also able to estimate clustered standard errors at the individual level using the vcovHC function.

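However, the individuals are clustered in groups, so I would like to cluster at this higher level of aggregation rather than at the individual level.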

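Unfortunately, it is not clear to me how to proceed. Of course, if I replace the individuals with the groups in the index, I get duplicate row.names and can no longer estimate the panel model with plm. I get the following error message: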

Error in `row.names<-.data.frame`(`*tmp*`, value = c("1-1", "1-1", "1-1", : duplicate 'row.names' are not allowed
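The collision behind this message can be seen without plm. The following is a minimal base R sketch on a hypothetical toy data frame (`toy`, `id`, `group` and the sizes are made up for illustration, not taken from the matched sample). It checks that (id, year) identifies each row while (group, year) does not, which is presumably why row names built from a group-year index are duplicated:

```r
# Hypothetical toy panel: 4 individuals in 2 groups, observed for 3 years
toy <- data.frame(
  id    = rep(1:4, each = 3),
  group = rep(1:2, each = 6),
  year  = rep(1:3, times = 4)
)

# (id, year) is unique per row, so it is a valid panel index
dup_id <- anyDuplicated(toy[, c("id", "year")])

# (group, year) is not unique: two individuals share every group-year cell
dup_group <- anyDuplicated(toy[, c("group", "year")])

dup_id     # 0 means no duplicated (id, year) pair
dup_group  # position of the first row that repeats a (group, year) pair
```

Under this reading, the group is better kept as an ordinary column for clustering, while the panel index stays at the individual level.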

For simplicity, I use the following example (copied from: http://www.richard-bluhm.com/clustered-ses-in-r-and-stata-2/):

# require packages
require(plm)
#> Loading required package: plm
require(lmtest)
#> Loading required package: lmtest
#> Loading required package: zoo
#> 
#> Attaching package: 'zoo'
#> The following objects are masked from 'package:base':
#> 
#>     as.Date, as.Date.numeric
# get data and load as pdata.frame
url <- "http://www.kellogg.northwestern.edu/faculty/petersen/htm/papers/se/test_data.txt"
p.df <- read.table(url)
names(p.df) <- c("firmid", "year", "x", "y")
# Introduce group (State) Id
p.df$State <- rep(1:100, each = 50)
p.df2 <- pdata.frame(p.df, index = c("State", "year"), drop.index = F, row.names = T)
#> Warning in pdata.frame(p.df, index = c("State", "year"), drop.index = F, : duplicate couples (id-time) in resulting pdata.frame
#>  to find out which, use e.g. table(index(your_pdataframe), useNA = "ifany")
# fit model with plm
pm1 <- plm(y ~ x, data = p.df2, model = "within") # this is where the error occurs.
#> Warning: non-unique values when setting 'row.names': '1-1', '1-10', '1-2',
#> '1-3', '1-4', '1-5', '1-6', '1-7', '1-8', '1-9', '10-1', '10-10', '10-2',
#> '10-3', '10-4', '10-5', '10-6', '10-7', '10-8', '10-9', '100-1', '100-10',
#> '100-2', '100-3', '100-4', '100-5', '100-6', '100-7', '100-8', '100-9', '11-1',
#> '11-10', '11-2', '11-3', '11-4', '11-5', '11-6', '11-7', '11-8', '11-9', '12-1',
#> '12-10', '12-2', '12-3', '12-4', '12-5', '12-6', '12-7', '12-8', '12-9', '13-1',
#> '13-10', '13-2', '13-3', '13-4', '13-5', '13-6', '13-7', '13-8', '13-9', '14-1',
#> '14-10', '14-2', '14-3', '14-4', '14-5', '14-6', '14-7', '14-8', '14-9', '15-1',
#> '15-10', '15-2', '15-3', '15-4', '15-5', '15-6', '15-7', '15-8', '15-9', '16-1',
#> '16-10', '16-2', '16-3', '16-4', '16-5', '16-6', '16-7', '16-8', '16-9', '17-1',
#> '17-10', '17-2', '17-3', '17-4', '17-5', '17-6', '17-7', '17-8', '17-9', '18-1',
#> '18-10', '18-2', '18-3', '18-4', '18-5', '18-6', '18-7', '18-8', '18-9', '19-1',
#> '19-10', '19-2', '19-3', '19-4', '19-5', '19-6', '19-7', '19-8', '19-9', '2-1',
#> '2-10', '2-2', '2-3', '2-4', '2-5', '2-6', '2-7', '2-8', '2-9', '20-1', '20-10',
#> '20-2', '20-3', '20-4', '20-5', '20-6', '20-7', '20-8', '20-9', '21-1', '21-10',
#> '21-2', '21-3', '21-4', '21-5', '21-6', '21-7', '21-8', '21-9', '22-1', '22-10',
#> '22-2', '22-3', '22-4', '22-5', '22-6', '22-7', '22-8', '22-9', '23-1', '23-10',
#> '23-2', '23-3', '23-4', '23-5', '23-6', '23-7', '23-8', '23-9', '24-1', '24-10',
#> '24-2', '24-3', '24-4', '24-5', '24-6', '24-7', '24-8', '24-9', '25-1', '25-10',
#> '25-2', '25-3', '25-4', '25-5', '25-6', '25-7', '25-8', '25-9', '26-1', '26-10',
#> '26-2', '26-3', '26-4', '26-5', '26-6', '26-7', '26-8', '26-9', '27-1', '27-10',
#> '27-2', '27-3', '27-4', '27-5', '27-6', '27-7', '27-8', '27-9', '28-1', '28-10',
#> '28-2', '28-3', '28-4', '28-5', '28-6', '28-7', '28-8', '28-9', '29-1', '29-10',
#> '29-2', '29-3', '29-4', '29-5', '29-6', '29-7', '29-8', '29-9', '3-1', '3-10',
#> '3-2', '3-3', '3-4', '3-5', '3-6', '3-7', '3-8', '3-9', '30-1', '30-10', '30-2',
#> '30-3', '30-4', '30-5', '30-6', '30-7', '30-8', '30-9', '31-1', '31-10', '31-2',
#> '31-3', '31-4', '31-5', '31-6', '31-7', '31-8', '31-9', '32-1', '32-10', '32-2',
#> '32-3', '32-4', '32-5', '32-6', '32-7', '32-8', '32-9', '33-1', '33-10', '33-2',
#> '33-3', '33-4', '33-5', '33-6', '33-7', '33-8', '33-9', '34-1', '34-10', '34-2',
#> '34-3', '34-4', '34-5', '34-6', '34-7', '34-8', '34-9', '35-1', '35-10', '35-2',
#> '35-3', '35-4', '35-5', '35-6', '35-7', '35-8', '35-9', '36-1', '36-10', '36-2',
#> '36-3', '36-4', '36-5', '36-6', '36-7', '36-8', '36-9', '37-1', '37-10', '37-2',
#> '37-3', '37-4', '37-5', '37-6', '37-7', '37-8', '37-9', '38-1', '38-10', '38-2',
#> '38-3', '38-4', '38-5', '38-6', '38-7', '38-8', '38-9', '39-1', '39-10', '39-2',
#> '39-3', '39-4', '39-5', '39-6', '39-7', '39-8', '39-9', '4-1', '4-10', '4-2',
#> '4-3', '4-4', '4-5', '4-6', '4-7', '4-8', '4-9', '40-1', '40-10', '40-2',
#> '40-3', '40-4', '40-5', '40-6', '40-7', '40-8', '40-9', '41-1', '41-10', '41-2',
#> '41-3', '41-4', '41-5', '41-6', '41-7', '41-8', '41-9', '42-1', '42-10', '42-2',
#> '42-3', '42-4', '42-5', '42-6', '42-7', '42-8', '42-9', '43-1', '43-10', '43-2',
#> '43-3', '43-4', '43-5', '43-6', '43-7', '43-8', '43-9', '44-1', '44-10', '44-2',
#> '44-3', '44-4', '44-5', '44-6', '44-7', '44-8', '44-9', '45-1', '45-10', '45-2',
#> '45-3', '45-4', '45-5', '45-6', '45-7', '45-8', '45-9', '46-1', '46-10', '46-2',
#> '46-3', '46-4', '46-5', '46-6', '46-7', '46-8', '46-9', '47-1', '47-10', '47-2',
#> '47-3', '47-4', '47-5', '47-6', '47-7', '47-8', '47-9', '48-1', '48-10', '48-2',
#> '48-3', '48-4', '48-5', '48-6', '48-7', '48-8', '48-9', '49-1', '49-10', '49-2',
#> '49-3', '49-4', '49-5', '49-6', '49-7', '49-8', '49-9', '5-1', '5-10', '5-2',
#> '5-3', '5-4', '5-5', '5-6', '5-7', '5-8', '5-9', '50-1', '50-10', '50-2',
#> '50-3', '50-4', '50-5', '50-6', '50-7', '50-8', '50-9', '51-1', '51-10', '51-2',
#> '51-3', '51-4', '51-5', '51-6', '51-7', '51-8', '51-9', '52-1', '52-10', '52-2',
#> '52-3', '52-4', '52-5', '52-6', '52-7', '52-8', '52-9', '53-1', '53-10', '53-2',
#> '53-3', '53-4', '53-5', '53-6', '53-7', '53-8', '53-9', '54-1', '54-10', '54-2',
#> '54-3', '54-4', '54-5', '54-6', '54-7', '54-8', '54-9', '55-1', '55-10', '55-2',
#> '55-3', '55-4', '55-5', '55-6', '55-7', '55-8', '55-9', '56-1', '56-10', '56-2',
#> '56-3', '56-4', '56-5', '56-6', '56-7', '56-8', '56-9', '57-1', '57-10', '57-2',
#> '57-3', '57-4', '57-5', '57-6', '57-7', '57-8', '57-9', '58-1', '58-10', '58-2',
#> '58-3', '58-4', '58-5', '58-6', '58-7', '58-8', '58-9', '59-1', '59-10', '59-2',
#> '59-3', '59-4', '59-5', '59-6', '59-7', '59-8', '59-9', '6-1', '6-10', '6-2',
#> '6-3', '6-4', '6-5', '6-6', '6-7', '6-8', '6-9', '60-1', '60-10', '60-2',
#> '60-3', '60-4', '60-5', '60-6', '60-7', '60-8', '60-9', '61-1', '61-10', '61-2',
#> '61-3', '61-4', '61-5', '61-6', '61-7', '61-8', '61-9', '62-1', '62-10', '62-2',
#> '62-3', '62-4', '62-5', '62-6', '62-7', '62-8', '62-9', '63-1', '63-10', '63-2',
#> '63-3', '63-4', '63-5', '63-6', '63-7', '63-8', '63-9', '64-1', '64-10', '64-2',
#> '64-3', '64-4', '64-5', '64-6', '64-7', '64-8', '64-9', '65-1', '65-10', '65-2',
#> '65-3', '65-4', '65-5', '65-6', '65-7', '65-8', '65-9', '66-1', '66-10', '66-2',
#> '66-3', '66-4', '66-5', '66-6', '66-7', '66-8', '66-9', '67-1', '67-10', '67-2',
#> '67-3', '67-4', '67-5', '67-6', '67-7', '67-8', '67-9', '68-1', '68-10', '68-2',
#> '68-3', '68-4', '68-5', '68-6', '68-7', '68-8', '68-9', '69-1', '69-10', '69-2',
#> '69-3', '69-4', '69-5', '69-6', '69-7', '69-8', '69-9', '7-1', '7-10', '7-2',
#> '7-3', '7-4', '7-5', '7-6', '7-7', '7-8', '7-9', '70-1', '70-10', '70-2',
#> '70-3', '70-4', '70-5', '70-6', '70-7', '70-8', '70-9', '71-1', '71-10', '71-2',
#> '71-3', '71-4', '71-5', '71-6', '71-7', '71-8', '71-9', '72-1', '72-10', '72-2',
#> '72-3', '72-4', '72-5', '72-6', '72-7', '72-8', '72-9', '73-1', '73-10', '73-2',
#> '73-3', '73-4', '73-5', '73-6', '73-7', '73-8', '73-9', '74-1', '74-10', '74-2',
#> '74-3', '74-4', '74-5', '74-6', '74-7', '74-8', '74-9', '75-1', '75-10', '75-2',
#> '75-3', '75-4', '75-5', '75-6', '75-7', '75-8', '75-9', '76-1', '76-10', '76-2',
#> '76-3', '76-4', '76-5', '76-6', '76-7', '76-8', '76-9', '77-1', '77-10', '77-2',
#> '77-3', '77-4', '77-5', '77-6', '77-7', '77-8', '77-9', '78-1', '78-10', '78-2',
#> '78-3', '78-4', '78-5', '78-6', '78-7', '78-8', '78-9', '79-1', '79-10', '79-2',
#> '79-3', '79-4', '79-5', '79-6', '79-7', '79-8', '79-9', '8-1', '8-10', '8-2',
#> '8-3', '8-4', '8-5', '8-6', '8-7', '8-8', '8-9', '80-1', '80-10', '80-2',
#> '80-3', '80-4', '80-5', '80-6', '80-7', '80-8', '80-9', '81-1', '81-10', '81-2',
#> '81-3', '81-4', '81-5', '81-6', '81-7', '81-8', '81-9', '82-1', '82-10', '82-2',
#> '82-3', '82-4', '82-5', '82-6', '82-7', '82-8', '82-9', '83-1', '83-10', '83-2',
#> '83-3', '83-4', '83-5', '83-6', '83-7', '83-8', '83-9', '84-1', '84-10', '84-2',
#> '84-3', '84-4', '84-5', '84-6', '84-7', '84-8', '84-9', '85-1', '85-10', '85-2',
#> '85-3', '85-4', '85-5', '85-6', '85-7', '85-8', '85-9', '86-1', '86-10', '86-2',
#> '86-3', '86-4', '86-5', '86-6', '86-7', '86-8', '86-9', '87-1', '87-10', '87-2',
#> '87-3', '87-4', '87-5', '87-6', '87-7', '87-8', '87-9', '88-1', '88-10', '88-2',
#> '88-3', '88-4', '88-5', '88-6', '88-7', '88-8', '88-9', '89-1', '89-10', '89-2',
#> '89-3', '89-4', '89-5', '89-6', '89-7', '89-8', '89-9', '9-1', '9-10', '9-2',
#> '9-3', '9-4', '9-5', '9-6', '9-7', '9-8', '9-9', '90-1', '90-10', '90-2',
#> '90-3', '90-4', '90-5', '90-6', '90-7', '90-8', '90-9', '91-1', '91-10', '91-2',
#> '91-3', '91-4', '91-5', '91-6', '91-7', '91-8', '91-9', '92-1', '92-10', '92-2',
#> '92-3', '92-4', '92-5', '92-6', '92-7', '92-8', '92-9', '93-1', '93-10', '93-2',
#> '93-3', '93-4', '93-5', '93-6', '93-7', '93-8', '93-9', '94-1', '94-10', '94-2',
#> '94-3', '94-4', '94-5', '94-6', '94-7', '94-8', '94-9', '95-1', '95-10', '95-2',
#> '95-3', '95-4', '95-5', '95-6', '95-7', '95-8', '95-9', '96-1', '96-10', '96-2',
#> '96-3', '96-4', '96-5', '96-6', '96-7', '96-8', '96-9', '97-1', '97-10', '97-2',
#> '97-3', '97-4', '97-5', '97-6', '97-7', '97-8', '97-9', '98-1', '98-10', '98-2',
#> '98-3', '98-4', '98-5', '98-6', '98-7', '98-8', '98-9', '99-1', '99-10', '99-2',
#> '99-3', '99-4', '99-5', '99-6', '99-7', '99-8', '99-9'
#> Error in `.rowNamesDF<-`(x, value = value): duplicate 'row.names' are not allowed

Created on 2020-05-29 by the reprex package (v0.3.0)

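When I ran into the same problem, I wrote a package (clubTamal). clubTamal converts a plm object (by re-estimating it) into an lm object, so that the multiwayvcov package can be used to cluster the standard errors. You can find an RPubs example and the documentation here: https://rpubs.com/eliascis/clubTamal.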

The package works for plm estimates with fixed effects (model='within') or first-difference models (model='fd').
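To illustrate the general idea of re-estimating the transformed model with lm and clustering at a higher level, here is a minimal base R sketch. It is not clubTamal's actual implementation: the simulated data and the helper names `within_tr`, `fd_tr` and `cluster_vcov` are hypothetical, and the sandwich formula is used without any finite-sample correction, so the numbers will generally differ from what vcovHC(type = "HC1") or multiwayvcov report.

```r
set.seed(1)

# Hypothetical simulated panel: 20 groups x 5 individuals x 6 years,
# sorted by id and then year (the first-difference step relies on this order)
n_group <- 20; n_per <- 5; n_year <- 6
d <- expand.grid(year = 1:n_year, id = 1:(n_group * n_per))
d$gid <- (d$id - 1) %/% n_per + 1                 # group of each individual
d$x   <- rnorm(nrow(d))
shock <- rnorm(n_group * n_year)[(d$gid - 1) * n_year + d$year]  # group-year shock
d$y   <- 2 * d$x + rnorm(n_group * n_per)[d$id] + shock + rnorm(nrow(d))

# Two ways to remove the individual effect before calling lm()
within_tr <- function(v, id) v - ave(v, id)                               # like model = "within"
fd_tr     <- function(v, id) ave(v, id, FUN = function(z) c(NA, diff(z))) # like model = "fd"

# Cluster-robust sandwich: (X'X)^-1 [ sum_g X_g' u_g u_g' X_g ] (X'X)^-1
cluster_vcov <- function(fit, cluster) {
  X <- model.matrix(fit)
  u <- residuals(fit)
  bread  <- solve(crossprod(X))
  scores <- rowsum(X * u, cluster)   # one row X_g' u_g per cluster
  bread %*% crossprod(scores) %*% bread
}

# Within model, clustered on the group instead of the individual
d$y_w <- within_tr(d$y, d$id)
d$x_w <- within_tr(d$x, d$id)
fit_w <- lm(y_w ~ x_w - 1, data = d)
se_w  <- sqrt(diag(cluster_vcov(fit_w, d$gid)))

# First-difference model: drop each individual's first year, then cluster on the group
d$y_d <- fd_tr(d$y, d$id)
d$x_d <- fd_tr(d$x, d$id)
keep  <- !is.na(d$y_d)
fit_d <- lm(y_d ~ x_d, data = d[keep, ])
se_d  <- sqrt(diag(cluster_vcov(fit_d, d$gid[keep])))

se_w
se_d
```

The point of the sketch is that the cluster variable only enters the variance computation, so the panel index can stay at the individual level while the standard errors are clustered on the group.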

To obtain the clustered covariance matrix, use the vcovTamal command.

The package is still under development, but it can be installed directly from GitHub:

library(devtools)
install_github("eliascis/clubTamal")

Unfortunately, the link to your example data does not work, but clubTamal also installs spd4testing, which builds a small simulated panel data set for testing purposes.

## packages
library(foreign)
library(lmtest)
#> Loading required package: zoo
#> 
#> Attaching package: 'zoo'
#> The following objects are masked from 'package:base':
#> 
#>     as.Date, as.Date.numeric
library(plm)
library(multiwayvcov)
library(spd4testing)
library(clubTamal)

## simulated data
d <- spd4testing()

## formula
f <- formula(y ~ x + factor(year))

## standard estimation
e <- plm(formula = f, data = d, model = "fd")
summary(e)
#> Oneway (individual) effect First-Difference Model
#> 
#> Call:
#> plm(formula = f, data = d, model = "fd")
#> 
#> Unbalanced Panel: n = 6, T = 3-5, N = 26
#> Observations used in estimation: 20
#> 
#> Residuals:
#>     Min.  1st Qu.   Median  3rd Qu.     Max. 
#> -250.333 -115.219   12.651  108.390  228.110 
#> 
#> Coefficients:
#>                  Estimate Std. Error t-value Pr(>|t|)  
#> (Intercept)       -80.937    199.405 -0.4059  0.69096  
#> x                  71.858     25.974  2.7666  0.01514 *
#> factor(year)2002  194.842    216.449  0.9002  0.38325  
#> factor(year)2003  109.118    414.298  0.2634  0.79609  
#> factor(year)2004  446.147    583.234  0.7650  0.45700  
#> factor(year)2005  451.514    752.479  0.6000  0.55807  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Total Sum of Squares:    1078300
#> Residual Sum of Squares: 394270
#> R-Squared:      0.63435
#> Adj. R-Squared: 0.50376
#> F-statistic: 4.85757 on 5 and 14 DF, p-value: 0.0087377
e <- plm(formula = f, data = d, model = "within")
summary(e)
#> Oneway (individual) effect Within Model
#> 
#> Call:
#> plm(formula = f, data = d, model = "within")
#> 
#> Unbalanced Panel: n = 6, T = 3-5, N = 26
#> 
#> Residuals:
#>      Min.   1st Qu.    Median   3rd Qu.      Max. 
#> -167.4294  -59.3741   -6.9404   73.7132  146.4199 
#> 
#> Coefficients:
#>                  Estimate Std. Error t-value Pr(>|t|)   
#> x                  72.362     23.434  3.0879 0.007501 **
#> factor(year)2002  113.786     77.276  1.4725 0.161569   
#> factor(year)2003  -67.413     75.012 -0.8987 0.383013   
#> factor(year)2004  200.420     83.649  2.3960 0.030062 * 
#> factor(year)2005  127.170     81.030  1.5694 0.137400   
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Total Sum of Squares:    441370
#> Residual Sum of Squares: 190660
#> R-Squared:      0.56803
#> Adj. R-Squared: 0.28005
#> F-statistic: 3.94491 on 5 and 15 DF, p-value: 0.017501

## clustering
# no clustering
v <- e$vcov
coeftest(e)
#> 
#> t test of coefficients:
#> 
#>                  Estimate Std. Error t value Pr(>|t|)   
#> x                  72.362     23.434  3.0879 0.007501 **
#> factor(year)2002  113.786     77.276  1.4725 0.161569   
#> factor(year)2003  -67.413     75.012 -0.8987 0.383013   
#> factor(year)2004  200.420     83.649  2.3960 0.030062 * 
#> factor(year)2005  127.170     81.030  1.5694 0.137400   
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# clustering at id level with plm-package
v <- vcovHC(e, type = "HC1", cluster = "group", tol = 1 * 10^-20)
coeftest(e, v)
#> 
#> t test of coefficients:
#> 
#>                  Estimate Std. Error t value Pr(>|t|)   
#> x                  72.362     24.586  2.9433 0.010070 * 
#> factor(year)2002  113.786     76.548  1.4865 0.157870   
#> factor(year)2003  -67.413     63.962 -1.0540 0.308585   
#> factor(year)2004  200.420     61.464  3.2608 0.005266 **
#> factor(year)2005  127.170     67.310  1.8893 0.078338 . 
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

## clustering at group level with clubTamal
v <- vcovTamal(estimate = e, data = d, groupvar = "gid")
#> Error in vcovTamal(estimate = e, data = d, groupvar = "gid"): better use the very fast and powerful lfe::felm
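# NOTE: vcovTamal() stopped with an error above, so v was not updated and still
# holds the id-level vcovHC matrix; the table below repeats the id-level result
coeftest(e, v)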
#> 
#> t test of coefficients:
#> 
#>                  Estimate Std. Error t value Pr(>|t|)   
#> x                  72.362     24.586  2.9433 0.010070 * 
#> factor(year)2002  113.786     76.548  1.4865 0.157870   
#> factor(year)2003  -67.413     63.962 -1.0540 0.308585   
#> factor(year)2004  200.420     61.464  3.2608 0.005266 **
#> factor(year)2005  127.170     67.310  1.8893 0.078338 . 
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
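The error message above points to lfe::felm. As a hedged sketch of what that could look like, the following fits a fixed-effects (within) model with standard errors clustered on the group. It uses hypothetical simulated data (`panel`, `id`, `gid` and the sizes are assumptions, not the spd4testing data), and it is a within specification, not a first-difference model.

```r
library(lfe)

set.seed(42)

# Hypothetical simulated panel: 30 groups, 4 individuals per group, 5 years
panel <- expand.grid(year = 2001:2005, id = 1:120)
panel$gid <- (panel$id - 1) %/% 4 + 1
panel$x   <- rnorm(nrow(panel))
panel$y   <- 1.5 * panel$x + rnorm(120)[panel$id] + rnorm(nrow(panel))

# felm formula parts: outcome ~ regressors | fixed effects | IV (0 = none) | cluster
est <- felm(y ~ x | id + year | 0 | gid, data = panel)

# With a cluster variable in the formula, the reported standard errors are clustered on gid
summary(est)
```

Because the cluster variable is the fourth part of the formula, the individual can remain the fixed-effect dimension while the group is used only for the variance.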

Created on 2020-05-29 by the reprex package (v0.3.0)