R lm parameter estimates
Error: variable lengths differ
I am confused by this error and don't know what to do.
n1<-20
m1<-0
sd1<-1
y<-rnorm(n1,m1, sd1)
x<-rnorm(n1,m1, sd1)
e<-rnorm(n1,m1, sd1)
b0<-0
b1<-1
modelfit1<-lm(y~ b0 + b1*x + e)
Error in model.frame.default(formula = y ~ b0 + b1 * x + e:
variable lengths differ (found for 'b0')
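Edited:
I am working on a case where n=20, the true parameters are b0=0 and b1=1, and the independent variable and the error are both normally distributed with mean=0 and sd=1.
Is this possible?
Thanks a lot!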
I suggest you put everything in a data.frame and work with it like this:
set.seed(2)
n1 <- 20  # sample size from the question, repeated so this snippet runs on its own
m1<-0
sd1<-1
y<-rnorm(n1,m1, sd1)
x<-rnorm(n1,m1, sd1)
b0<-0
b1<-1
d <- data.frame(y,b0,b1,x,e=rnorm(20,0,1))
head(d)
# y b0 b1 x e
# 1 -0.89691455 0 1 2.090819205 -0.3835862
# 2 0.18484918 0 1 -1.199925820 -1.9591032
# 3 1.58784533 0 1 1.589638200 -0.8417051
# 4 -1.13037567 0 1 1.954651642 1.9035475
# 5 -0.08025176 0 1 0.004937777 0.6224939
# 6 0.13242028 0 1 -2.451706388 1.9909204
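The reason the data.frame avoids the error appears to be recycling: in the original call, `b0` and `b1` are length-1 vectors while `y` has 20 elements, which is what `variable lengths differ (found for 'b0')` complains about. `data.frame()` repeats the scalars over all 20 rows. A minimal sketch of that, assuming the same values as in the question (the seed is arbitrary):

```r
set.seed(2)          # arbitrary seed, only for reproducibility
n1 <- 20
y  <- rnorm(n1)
x  <- rnorm(n1)
b0 <- 0
b1 <- 1

# In the workspace the scalars have length 1 while y has length 20
lens_before <- c(y = length(y), b0 = length(b0), b1 = length(b1))
print(lens_before)

# data.frame() recycles length-1 inputs to the common number of rows
d <- data.frame(y, b0, b1, x)
lens_after <- sapply(d[c("y", "b0", "b1")], length)
print(lens_after)
```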
Now the call runs without the length error:
modelfit1 <- lm(y~b0+b1*x+e, data=d)
modelfit1
# Call:
# lm(formula = y ~ b0 + b1 * x + e, data = d)
# Coefficients:
# (Intercept) b0 b1 x e b1:x
# 0.19331 NA NA -0.06752 0.02240 NA
summary(modelfit1)
# Call:
# lm(formula = y ~ b0 + b1 * x + e, data = d)
# Residuals:
# Min 1Q Median 3Q Max
# -2.5006 -0.4786 -0.1425 0.6211 1.8488
# Coefficients: (3 not defined because of singularities)
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 0.19331 0.25013 0.773 0.450
# b0 NA NA NA NA
# b1 NA NA NA NA
# x -0.06752 0.21720 -0.311 0.760
# e 0.02240 0.20069 0.112 0.912
# b1:x NA NA NA NA
# Residual standard error: 1.115 on 17 degrees of freedom
# Multiple R-squared: 0.006657, Adjusted R-squared: -0.1102
# F-statistic: 0.05697 on 2 and 17 DF, p-value: 0.9448
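Note that the fit runs, but three coefficients are `NA`. This appears to be because the formula treats `b0` and `b1` as predictor columns rather than as fixed parameter values: `b0` is a column of zeros, `b1` duplicates the intercept column, and `b1:x` duplicates `x`. A small sketch that checks this through the rank of the model matrix, assuming a data frame built the same way as `d` (the seed is arbitrary):

```r
set.seed(2)  # arbitrary seed
n1 <- 20
d <- data.frame(y = rnorm(n1), b0 = 0, b1 = 1, x = rnorm(n1), e = rnorm(n1))

# Design matrix that lm() builds from the formula
X <- model.matrix(y ~ b0 + b1 * x + e, data = d)
n_cols <- ncol(X)      # number of candidate coefficients
rank_X <- qr(X)$rank   # number that can actually be estimated
print(c(columns = n_cols, rank = rank_X))

# Coefficients that lm() reports as NA because of the collinearity
fit <- lm(y ~ b0 + b1 * x + e, data = d)
dropped <- names(which(is.na(coef(fit))))
print(dropped)
```

The gap between the number of columns and the rank corresponds to the "not defined because of singularities" note in the summary.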
I may be wrong, but I believe you want to simulate an outcome and then estimate its parameters. If so, you would rather do the following:
n1 <- 20
m1 <- 0
sd1<- 1
b0 <- 0
b1 <- 1
x <- rnorm(n1,m1, sd1)
e <- rnorm(n1,m1, sd1)
y <- b0 + b1*x + e
summary(lm(y~x))
Call:
lm(formula = y ~ x)
Residuals:
Min 1Q Median 3Q Max
-1.66052 -0.40203 0.05659 0.44115 1.38798
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.3078 0.1951 -1.578 0.132
x 1.1774 0.2292 5.137 6.9e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.852 on 18 degrees of freedom
Multiple R-squared: 0.5945, Adjusted R-squared: 0.572
F-statistic: 26.39 on 1 and 18 DF, p-value: 6.903e-05
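One way to read such output against the known truth is to check whether the simulated values `b0 = 0` and `b1 = 1` fall inside the confidence intervals of the fit. A hedged sketch using `confint()`; the seed and the 95% level are assumptions made for illustration, so the numbers will not match the output above:

```r
set.seed(1)  # arbitrary seed, not the one behind the output above
n1 <- 20; b0 <- 0; b1 <- 1
x <- rnorm(n1, 0, 1)
e <- rnorm(n1, 0, 1)
y <- b0 + b1 * x + e

fit <- lm(y ~ x)
ci  <- confint(fit, level = 0.95)   # 2 x 2 matrix of lower and upper bounds

# TRUE where the true parameter lies inside its interval
truth   <- c("(Intercept)" = b0, x = b1)
covered <- ci[, 1] <= truth & truth <= ci[, 2]
print(cbind(ci, truth, covered))
```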
If you want to do this many times, consider the following:
repetitions <- 5
betas <- t(sapply(1:repetitions, function(i){
  y <- b0 + b1*x + rnorm(n1,m1, sd1)
  coefficients(lm(y~x))
}))
betas
(Intercept) x
[1,] 0.21989182 0.8185690
[2,] -0.12820726 0.7289041
[3,] -0.27596844 0.9794432
[4,] 0.06145306 1.0575050
[5,] -0.31429950 0.9984262
Now you can look at the mean of the estimated betas:
colMeans(betas)
(Intercept) x
-0.08742606 0.91656951
and at their variance-covariance matrix:
var(betas)
(Intercept) x
(Intercept) 0.051323041 -0.007976803
x -0.007976803 0.018834711
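With only 5 repetitions these summaries are noisy. Because `x` is held fixed across repetitions in the loop, the estimates can be compared with the standard covariance of the least-squares estimator for a fixed design, `sd1^2 * solve(t(X) %*% X)`. A sketch under those assumptions; the seed and the 2000 repetitions are arbitrary choices, not taken from the answer:

```r
set.seed(42)  # arbitrary seed
n1 <- 20; m1 <- 0; sd1 <- 1
b0 <- 0; b1 <- 1
x <- rnorm(n1, m1, sd1)      # fixed design, reused in every repetition

repetitions <- 2000          # assumed value, larger than the 5 used above
betas <- t(sapply(seq_len(repetitions), function(i) {
  y <- b0 + b1 * x + rnorm(n1, m1, sd1)
  coefficients(lm(y ~ x))
}))

# Theoretical covariance of the OLS estimator for a fixed design
X <- cbind("(Intercept)" = 1, x = x)
theory <- sd1^2 * solve(crossprod(X))

print(colMeans(betas))       # should be close to c(0, 1)
print(var(betas))            # empirical covariance
print(theory)                # theoretical covariance
```

As the number of repetitions grows, `colMeans(betas)` should settle near the true parameters and `var(betas)` near `theory`.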