Find confidence interval of a regression line at its center per group

Question

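I have the following simulated data for fitting a regression model, where y and x1 are continuous variables and x2 is a categorical variable.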

y <- rnorm(100, 2, 3)
x1 <- rnorm(100, 2.5, 2.8)
x2 <- factor(c(rep(1,45), rep(0,55)))

I need to find the 95% confidence interval for y when x2 = 0 and x1 equals the mean of x1 within the x2 = 0 group.

Here is what I did:

mod <- lm(y ~ x1 * x2)

tapply(x1, x2, mean)
#       0        1 
#3.107850 2.294103 

pred.dat <- data.frame(x1 = 3.107850, x2 = "0")

predict(mod, pred.dat, interval = "confidence", level = 0.95)
#       fit      lwr      upr
#1 2.413393 1.626784 3.200003

predict(mod, pred.dat, interval = "prediction", level = 0.95)
#       fit       lwr      upr
#1 2.413393 -3.473052 8.299839

I want to know whether I did this correctly. I would also like to know whether there is an easier way than this.

Answer 1

Setup

set.seed(0)
y <- rnorm(100, 2, 3)
x1 <- rnorm(100, 2.5, 2.8)
x2 <- factor(c(rep(1,45), rep(0,55)))

mod <- lm(y ~ x1 * x2)

95% confidence intervals for y when x2 = 0 and x1 equals the mean within x2 = 0.

I want to know whether I did this correctly or not.

Your use of predict is correct.
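If you want to convince yourself of this, one option is to rebuild the intervals by hand from the standard error that predict returns with se.fit = TRUE. The following is only a sketch under the setup above; the names p, tq, manual.ci and manual.pi are introduced here for illustration and are not part of the question.

```r
set.seed(0)
y  <- rnorm(100, 2, 3)
x1 <- rnorm(100, 2.5, 2.8)
x2 <- factor(c(rep(1, 45), rep(0, 55)))
mod <- lm(y ~ x1 * x2)

# group mean of x1 within x2 = 0
pred.dat <- data.frame(x1 = mean(x1[x2 == "0"]), x2 = "0")

# fitted value, its standard error, residual df and residual scale
p <- predict(mod, pred.dat, se.fit = TRUE)

# t quantile for a two-sided 95% interval on the residual df
tq <- qt(0.975, df = p$df)

# confidence interval: uncertainty in the fitted mean only
manual.ci <- cbind(fit = p$fit,
                   lwr = p$fit - tq * p$se.fit,
                   upr = p$fit + tq * p$se.fit)

# prediction interval: adds the residual variance of a new observation
se.pred <- sqrt(p$se.fit^2 + p$residual.scale^2)
manual.pi <- cbind(fit = p$fit,
                   lwr = p$fit - tq * se.pred,
                   upr = p$fit + tq * se.pred)

# both comparisons should report TRUE
all.equal(manual.ci, predict(mod, pred.dat, interval = "confidence", level = 0.95),
          check.attributes = FALSE)
all.equal(manual.pi, predict(mod, pred.dat, interval = "prediction", level = 0.95),
          check.attributes = FALSE)
```

The prediction interval is wider than the confidence interval because it also accounts for the scatter of a single new observation around the regression line, which is why the two results in the question differ so much.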

I want to know whether there is any easier way than this.

You can skip tapply:

pred.data <- data.frame(x1 = mean(x1[x2 == "0"]), x2 = "0")
#        x1 x2
#1 2.649924  0

Alternatively, you can do

pred.data <- setNames(stack(tapply(x1, x2, mean)), c("x1", "x2"))
#        x1 x2
#1 2.649924  0
#2 2.033328  1

This way you get the results for both factor levels in one go.
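As a sketch of that last step, the two-row data frame can be passed straight to predict. The names ci and single are introduced here for illustration only.

```r
set.seed(0)
y  <- rnorm(100, 2, 3)
x1 <- rnorm(100, 2.5, 2.8)
x2 <- factor(c(rep(1, 45), rep(0, 55)))
mod <- lm(y ~ x1 * x2)

# one row per level of x2, with x1 set to that group's mean
pred.data <- setNames(stack(tapply(x1, x2, mean)), c("x1", "x2"))

# one confidence interval per row of pred.data
ci <- predict(mod, pred.data, interval = "confidence", level = 0.95)

# show the group centers next to their intervals
cbind(pred.data, ci)

# the x2 = 0 row should match the single-row call
single <- predict(mod, data.frame(x1 = mean(x1[x2 == "0"]), x2 = "0"),
                  interval = "confidence", level = 0.95)
all.equal(ci[1, ], single[1, ], check.attributes = FALSE)
```

Replacing interval = "confidence" with interval = "prediction" gives the prediction intervals for both groups in the same way.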
