R lm parameter estimation

Question

Error: variable lengths differ

I'm confused by this error and don't know how to fix it.

n1<-20
m1<-0
sd1<-1
y<-rnorm(n1,m1, sd1)
x<-rnorm(n1,m1, sd1)
e<-rnorm(n1,m1, sd1)
b0<-0
b1<-1

modelfit1<-lm(y~ b0 + b1*x + e)
Error in model.frame.default(formula = y ~ b0 + b1 * x + e:
variable lengths differ (found for 'b0')

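Edited: I'm working on a case where n=20, the true parameters are b0=0 and b1=1, and both the independent variable and the error term are normally distributed with mean=0 and sd=1. Is this possible?

Thanks a lot!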

Answer 1

I suggest putting everything into a data.frame and working from there:

set.seed(2)
n1<-20
m1<-0
sd1<-1
y<-rnorm(n1,m1, sd1)
x<-rnorm(n1,m1, sd1)
b0<-0
b1<-1

d <- data.frame(y,b0,b1,x,e=rnorm(20,0,1))
head(d)
#             y b0 b1            x          e
# 1 -0.89691455  0  1  2.090819205 -0.3835862
# 2  0.18484918  0  1 -1.199925820 -1.9591032
# 3  1.58784533  0  1  1.589638200 -0.8417051
# 4 -1.13037567  0  1  1.954651642  1.9035475
# 5 -0.08025176  0  1  0.004937777  0.6224939
# 6  0.13242028  0  1 -2.451706388  1.9909204

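Now lm() runs without the length error (b0, b1 and b1:x come back as NA because b0 and b1 are constant columns):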

modelfit1 <- lm(y~b0+b1*x+e, data=d)
modelfit1
# Call:
# lm(formula = y ~ b0 + b1 * x + e, data = d)
# Coefficients:
# (Intercept)           b0           b1            x            e         b1:x  
#     0.19331           NA           NA     -0.06752      0.02240           NA  
summary(modelfit1)
# Call:
# lm(formula = y ~ b0 + b1 * x + e, data = d)
# Residuals:
#     Min      1Q  Median      3Q     Max 
# -2.5006 -0.4786 -0.1425  0.6211  1.8488 
# Coefficients: (3 not defined because of singularities)
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)  0.19331    0.25013   0.773    0.450
# b0                NA         NA      NA       NA
# b1                NA         NA      NA       NA
# x           -0.06752    0.21720  -0.311    0.760
# e            0.02240    0.20069   0.112    0.912
# b1:x              NA         NA      NA       NA
# Residual standard error: 1.115 on 17 degrees of freedom
# Multiple R-squared:  0.006657,    Adjusted R-squared:  -0.1102 
# F-statistic: 0.05697 on 2 and 17 DF,  p-value: 0.9448
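
The NA coefficients in the output above can be traced back to how the scalars are handled. As a rough sketch (same seed and variable names as above; `fit_full`, `fit_small`, `lengths_seen`, `na_terms` and `same_fit` are names introduced here for illustration), the original error comes from b0 and b1 having length 1 while y has length 20. data.frame() recycles them into constant columns, which are collinear with the intercept, so lm() cannot estimate them:

```r
set.seed(2)
n1 <- 20
y <- rnorm(n1)
x <- rnorm(n1)
e <- rnorm(n1)
b0 <- 0
b1 <- 1

# The scalars have length 1, while y, x and e have length 20
lengths_seen <- c(y = length(y), b0 = length(b0), b1 = length(b1))

# data.frame() recycles the scalars to 20 rows, so lm() no longer complains
d <- data.frame(y, b0, b1, x, e)

fit_full  <- lm(y ~ b0 + b1 * x + e, data = d)
fit_small <- lm(y ~ x + e, data = d)

# Terms that lm() could not estimate (constant or duplicated columns)
na_terms <- names(coef(fit_full))[is.na(coef(fit_full))]

# Dropping the NA terms leaves the same estimates as the smaller model
same_fit <- all.equal(
  unname(coef(fit_full)[!is.na(coef(fit_full))]),
  unname(coef(fit_small))
)
```

In other words, the data.frame version is effectively fitting y ~ x + e, where y was drawn independently of x and e, which is why the fit explains almost nothing.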

Answer 2

I may be wrong, but I believe you want to simulate an outcome and then estimate its parameters. If so, you would rather do something like this:

n1 <- 20
m1 <- 0
sd1<- 1
b0 <- 0
b1 <- 1

x <- rnorm(n1,m1, sd1)
e <- rnorm(n1,m1, sd1)


y <- b0 + b1*x + e
summary(lm(y~x))

Call:
lm(formula = y ~ x)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.66052 -0.40203  0.05659  0.44115  1.38798 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  -0.3078     0.1951  -1.578    0.132    
x             1.1774     0.2292   5.137  6.9e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.852 on 18 degrees of freedom
Multiple R-squared:  0.5945,    Adjusted R-squared:  0.572 
F-statistic: 26.39 on 1 and 18 DF,  p-value: 6.903e-05
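
One way to compare the estimates with the true values b0 = 0 and b1 = 1 is to look at confidence intervals. This is only a sketch: the seed is an assumption (the output above was produced without one, so the numbers will differ), and `fit`, `ci` and `covers` are names introduced here for illustration.

```r
set.seed(1)  # assumed seed, not part of the original code
n1 <- 20
b0 <- 0
b1 <- 1

x <- rnorm(n1, 0, 1)
e <- rnorm(n1, 0, 1)
y <- b0 + b1 * x + e

fit <- lm(y ~ x)

# 95% confidence intervals for the intercept and the slope
ci <- confint(fit, level = 0.95)

# TRUE where the interval contains the true parameter value
covers <- ci[, 1] <= c(b0, b1) & c(b0, b1) <= ci[, 2]
```

With a single sample of 20 observations an interval can miss the true value by chance, so `covers` is not guaranteed to be TRUE for both parameters.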

If you want to do this many times, consider the following:

repetitions <- 5
betas <- t(sapply(1:repetitions, function(i){
  y <- b0 + b1*x + rnorm(n1,m1, sd1)
  coefficients(lm(y~x))
  }))
betas
     (Intercept)         x
[1,]  0.21989182 0.8185690
[2,] -0.12820726 0.7289041
[3,] -0.27596844 0.9794432
[4,]  0.06145306 1.0575050
[5,] -0.31429950 0.9984262

Now you can look at the mean of the estimated betas:

colMeans(betas)
(Intercept)           x 
-0.08742606  0.91656951

And at their variance-covariance matrix:

var(betas)
             (Intercept)            x
(Intercept)  0.051323041 -0.007976803
x           -0.007976803  0.018834711
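
Five repetitions give a very noisy picture. As a hedged sketch, the same idea with more repetitions can be compared against the textbook covariance of the OLS estimator for a fixed design, sd1^2 * (X'X)^-1. The seed and the number of repetitions are assumptions chosen for illustration, and `X`, `theoretical` and `empirical` are names introduced here.

```r
set.seed(42)  # assumed seed, for reproducibility only
n1 <- 20
b0 <- 0
b1 <- 1
sd1 <- 1

x <- rnorm(n1, 0, 1)   # x is kept fixed across repetitions
repetitions <- 2000    # assumed; larger than the 5 used above

# Each repetition draws new errors, refits the model and keeps the coefficients
betas <- t(replicate(repetitions, {
  y <- b0 + b1 * x + rnorm(n1, 0, sd1)
  coef(lm(y ~ x))
}))

# Covariance of the OLS estimator for this fixed x: sd1^2 * (X'X)^-1
X <- cbind(1, x)
theoretical <- sd1^2 * solve(crossprod(X))

# Monte Carlo counterpart from the simulated estimates
empirical <- var(betas)
```

With this many repetitions, colMeans(betas) should land close to c(0, 1), and `empirical` should be close to `theoretical`.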
