How to tidy the output of models in R that broom does not support
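I have been trying to tidy the output of wfe models so that I can easily pass it to ggplot and similar tools. I run into the same problem with other packages and statistical models that broom does not cover.
Suppose I create a dataset like this (taken from the wfe documentation):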
library (wfe)
## generate panel data with number of units = N, number of time = Time
N <- 10 # number of distinct units
Time <- 15 # number of distinct time
## treatment effect
beta <- 1
## generate treatment variable
treat <- matrix(rbinom(N*Time, size = 1, 0.25), ncol = N)
## make sure at least one observation is treated for each unit
while ((sum(apply(treat, 2, mean) == 0) > 0) | (sum(apply(treat, 2, mean) == 1) > 0) |
(sum(apply(treat, 1, mean) == 0) > 0) | (sum(apply(treat, 1, mean) == 1) > 0)) {
treat <- matrix(rbinom(N*Time, size = 1, 0.25), ncol = N)
}
treat.vec <- c(treat)
## unit fixed effects
alphai <- rnorm(N, mean = apply(treat, 2, mean))
## generate two random covariates
x1 <- matrix(rnorm(N*Time, 0.5,1), ncol=N)
x2 <- matrix(rbeta(N*Time, 5,1), ncol=N)
x1.vec <- c(x1)
x2.vec <- c(x2)
## generate outcome variable
y <- matrix(NA, ncol = N, nrow = Time)
for (i in 1:N) {
y[, i] <- alphai[i] + treat[, i] + x1[,i] + x2[,i] + rnorm(Time)
}
y.vec <- c(y)
## generate unit and time index
unit.index <- rep(1:N, each = Time)
time.index <- rep(1:Time, N)
Data.obs <- as.data.frame(cbind(y.vec, treat.vec, unit.index, time.index, x1.vec, x2.vec))
colnames(Data.obs) <- c("y", "tr", "unit", "time", "x1", "x2")
Now I run the model with the wfe function (again, the code comes from the package's help file):
mod.did <- wfe(y~ tr+x1+x2, data = Data.obs, treat = "tr",
unit.index = "unit", time.index = "time", method = "unit",
qoi = "ate", estimator ="did", hetero.se=TRUE, auto.se=TRUE,
White = TRUE, White.alpha = 0.05, verbose = TRUE)
## summarize the results
summary(mod.did)
My question is how to turn this output into a tidy object that I can plot.
If I call tidy(mod.did), I get the following error:
Error: No tidy method for objects of class wfedid
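I understand the error, but I am not sure how to fix it. I tried mapping the individual parameters (coefficients, standard errors, etc.) into a new list object, but that did not work, so I am hoping someone here knows a more systematic way to do this.
If it helps, here is the dput of the output: https://pastebin.com/HTkKEUUQ
Thanks!
Here is the start of a tidy method: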
library(dplyr); library(tibble)
## tidy method for wfedid objects: returns the coefficient table as a tibble
## (conf.int and conf.level are accepted but not used yet)
tidy.wfedid <- function(x, conf.int=FALSE, conf.level=0.95, ...) {
    cc <- (coef(summary(x))
        %>% as.data.frame()
        %>% setNames(c("estimate","std.error","statistic","p.value"))
        %>% tibble::rownames_to_column("term")
        %>% as_tibble()
    )
    return(cc)
}
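Note that (1) I have not implemented the confidence-interval part (you can do this by using mutate to add the columns (conf.low, conf.high) = estimate ± std.error*qnorm((1+conf.level)/2)); (2) this is the standard tidy method, which returns a coefficient table. If you want predictions and confidence intervals on the predictions, you will need to write an augment method ...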
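The confidence-interval step from note (1) could be filled in roughly as follows. This is only a sketch in base R, not part of wfe or broom: tidy_coef_table is a hypothetical helper name, the interval is a plain normal-approximation (Wald) interval, and the method still assumes, as the answer does, that coef(summary(x)) returns a four-column coefficient matrix. The demo matrix holds made-up numbers purely to show the shape of the result.

```r
# Hypothetical helper (not part of wfe or broom): convert a coefficient matrix
# whose columns are estimate, std. error, statistic, p-value into a tidy data frame.
tidy_coef_table <- function(coef_mat, conf.int = FALSE, conf.level = 0.95) {
  out <- data.frame(
    term      = rownames(coef_mat),
    estimate  = coef_mat[, 1],
    std.error = coef_mat[, 2],
    statistic = coef_mat[, 3],
    p.value   = coef_mat[, 4],
    row.names = NULL,
    stringsAsFactors = FALSE
  )
  if (conf.int) {
    # Normal-approximation (Wald) interval: estimate +/- z * std.error
    z <- qnorm((1 + conf.level) / 2)
    out$conf.low  <- out$estimate - z * out$std.error
    out$conf.high <- out$estimate + z * out$std.error
  }
  out
}

# S3 method for wfedid objects, assuming (as in the answer above) that
# coef(summary(x)) returns the four-column coefficient matrix.
tidy.wfedid <- function(x, conf.int = FALSE, conf.level = 0.95, ...) {
  tidy_coef_table(coef(summary(x)), conf.int = conf.int, conf.level = conf.level)
}

# Made-up coefficient matrix, used only to illustrate the output format.
demo <- matrix(
  c(1.0, 0.5, 2.0, 0.0455,
    0.2, 0.1, 2.0, 0.0455),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("tr", "x1"),
                  c("Estimate", "Std.Error", "t.value", "p.value"))
)
demo_tidy <- tidy_coef_table(demo, conf.int = TRUE)
print(demo_tidy)
```

With broom loaded and the model from the question fitted, tidy(mod.did, conf.int = TRUE) should then return the same kind of table, which can go straight into ggplot, for example with geom_pointrange() mapping term, estimate, conf.low and conf.high.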