Extract the values of the probability density function based on a Stan linear model and add them to the data
Given the sample data sampleDT and the models lm.fit and brm.fit below, I would like to:
estimate, extract and add to the data frame the values of the density function for a conditional normal distribution evaluated at the observed level of the variable dollar.wage_1.
I can do this with the frequentist linear regression lm.fit and dnorm, but my attempt to do the same with the Bayesian model brm.fit fails. Any help would be appreciated.
## Sample data
sampleDT<-structure(list(id = 1:10, N = c(10L, 10L, 10L, 10L, 10L, 10L,
10L, 10L, 10L, 10L), A = c(62L, 96L, 17L, 41L, 212L, 143L, 143L,
143L, 73L, 73L), B = c(3L, 1L, 0L, 2L, 170L, 21L, 0L, 33L, 62L,
17L), C = c(0.05, 0.01, 0, 0.05, 0.8, 0.15, 0, 0.23, 0.85, 0.23
), employer = c(1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 0L, 0L), F = c(0L,
0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L), G = c(1.94, 1.19, 1.16,
1.16, 1.13, 1.13, 1.13, 1.13, 1.12, 1.12), H = c(0.14, 0.24,
0.28, 0.28, 0.21, 0.12, 0.17, 0.07, 0.14, 0.12), dollar.wage_1 = c(1.94,
1.19, 3.16, 3.16, 1.13, 1.13, 2.13, 1.13, 1.12, 1.12), dollar.wage_2 = c(1.93,
1.18, 3.15, 3.15, 1.12, 1.12, 2.12, 1.12, 1.11, 1.11), dollar.wage_3 = c(1.95,
1.19, 3.16, 3.16, 1.14, 1.13, 2.13, 1.13, 1.13, 1.13), dollar.wage_4 = c(1.94,
1.18, 3.16, 3.16, 1.13, 1.13, 2.13, 1.13, 1.12, 1.12), dollar.wage_5 = c(1.94,
1.19, 3.16, 3.16, 1.14, 1.13, 2.13, 1.13, 1.12, 1.12), dollar.wage_6 = c(1.94,
1.18, 3.16, 3.16, 1.13, 1.13, 2.13, 1.13, 1.12, 1.12), dollar.wage_7 = c(1.94,
1.19, 3.16, 3.16, 1.14, 1.13, 2.13, 1.13, 1.12, 1.12), dollar.wage_8 = c(1.94,
1.19, 3.16, 3.16, 1.13, 1.13, 2.13, 1.13, 1.12, 1.12), dollar.wage_9 = c(1.94,
1.19, 3.16, 3.16, 1.13, 1.13, 2.13, 1.13, 1.12, 1.12), dollar.wage_10 = c(1.94,
1.19, 3.16, 3.16, 1.13, 1.13, 2.13, 1.13, 1.12, 1.12)), row.names = c(NA,
-10L), class = "data.frame")
## Frequentist model: this works
lm.fit <-lm(dollar.wage_1 ~ A + B + C + employer + F + G + H,
data=sampleDT)
sampleDT$dens1 <-dnorm(sampleDT$dollar.wage_1,mean=lm.fit$fitted,
sd=summary(lm.fit)$sigma)
## Bayesian model: this is my attempt - it does not work
library(brms)
# this works
brm.fit <- brm(dollar.wage_1 ~ A + B + C + employer + F + G + H,
               data = sampleDT, iter = 4000, family = gaussian())
# this does not work
sampleDT$dens1_bayes <- dnorm(sampleDT$dollar.wage_1, mean = fitted(brm.fit), sd = summary(brm.fit)$sigma)
Error in dnorm(sampleDT$dollar.wage_1, mean = brm.fit$fitted, sd =
summary(brm.fit)$sigma) : Non-numeric argument to mathematical
function
Thanks in advance for your help.
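Note that fitted(brm.fit) is a matrix, so we only want to use its first column, the estimates. Also, since there is no reason for the two object structures to be the same, summary(brm.fit)$sigma returns nothing. Instead, you want the sigma estimate stored in summary(brm.fit)$spec_pars. Hence, you can use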
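# Index sigma by name so the result is a plain number whether
# spec_pars is stored as a matrix or as a data frame.
sampleDT$dens1_bayes <- dnorm(sampleDT$dollar.wage_1,
                              mean = fitted(brm.fit)[, 1],
                              sd = summary(brm.fit)$spec_pars["sigma", "Estimate"])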
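The call above plugs posterior point estimates into dnorm. As a possible alternative, here is a minimal sketch (not part of the original answer) that evaluates the density for every posterior draw and then averages over draws. The helper name avg_density and its argument names are hypothetical; it assumes mu_draws is a draws-by-observations matrix of conditional means and sigma_draws is the matching vector of residual standard deviations. The toy inputs are made up purely to show the shapes involved.

```r
# Hypothetical helper (not part of brms): average the normal density over
# posterior draws instead of plugging in point estimates.
#   y           : observed responses, length N
#   mu_draws    : S x N matrix of conditional means, one row per draw
#   sigma_draws : length-S vector of residual standard deviations
avg_density <- function(y, mu_draws, sigma_draws) {
  stopifnot(is.matrix(mu_draws),
            ncol(mu_draws) == length(y),
            nrow(mu_draws) == length(sigma_draws))
  # For observation i, evaluate the density under each draw, then average.
  vapply(seq_along(y), function(i) {
    mean(dnorm(y[i], mean = mu_draws[, i], sd = sigma_draws))
  }, numeric(1))
}

# Toy inputs with made-up draws (4 draws, 3 observations), no Stan needed.
set.seed(1)
y <- c(1.9, 1.2, 3.1)
mu_draws <- matrix(rnorm(4 * 3, mean = 2, sd = 0.1), nrow = 4)
sigma_draws <- runif(4, min = 0.5, max = 1)
avg_density(y, mu_draws, sigma_draws)
```

With a fitted model, the inputs could presumably be taken from posterior_epred(brm.fit) and as_draws_df(brm.fit)$sigma, assuming a brms version that provides both functions and returns the draws in the same order. Whether the plug-in or the averaged density is appropriate depends on what the values are used for.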