使用 R 中的 JAGS 进行结果预测

Question

[代码已更新，不再对应错误信息]

我想了解 JAGS 如何预测结果值（对于混合马尔可夫模型）。我已经在包含结果 m 和协变量 x1、x2 和 x3.

的数据集上训练了模型

在不固定参数值的情况下预测结果在 R 中有效，但输出似乎完全随机：

preds <- run.jags("model.txt",
                  data=list(x1=x1, x2=x2, x3=x3, m=m,
                            statealpha=rep(1,times=M), M=M, T=T, N=N), monitor=c("m_pred"),
                  n.chains=1, inits = NA, sample=1)

Compiling rjags model... Calling the simulation using the rjags method... Note: the model did not require adaptation Burning in the model for 4000 iterations... |**************************************************| 100% Running the model for 1 iterations... Simulation complete Finished running the simulation

但是，一旦我尝试修复参数（即使用模型估计来预测结果 m，我就会收到错误消息：

preds <- run.jags("model.txt",
                  data=list(x1=x1, x2=x2, x3=x3,
                            statealpha=rep(1,times=M), M=M, T=T, N=N, beta1=beta1), monitor=c("m"),
                  n.chains=1, inits = NA, sample=1)

Compiling rjags model... Error: The following error occured when compiling and adapting the model using rjags: Error in rjags::jags.model(model, data = dataenv, n.chains = length(runjags.object$end.state), : RUNTIME ERROR: Compilation error on line 39. beta1[2,1] is a logical node and cannot be observed

beta1 在这种情况下是系数估计的 2x2 矩阵。

JAGS 如何预测第一个示例中的 m（无固定参数）？是完全随机选择m吗？
如何包含早期获得的模型估计来模拟新的结果值？

型号是：

model{
 for (i in 1:N)
 {

for (t in 1:T)
  {
  m[t,i] ~ dcat(ps[i,t,])
  }

for (state in 1:M)
  {
  ps[i,1,state] <- probs1[state]

  for (t in 2:T)
    {
    ps[i,t,state] <- probs[m[(t-1),i], state, i,t]
    }

  for (prev in 1:M){
       for (t in 1:T) {
    probs[prev,state,i,t] <- odds[prev,state,i,t]/totalodds[prev,i,t]
    odds[prev,state,i,t] <- exp(alpha[prev,state,i] +
                                beta1[prev,state]*x1[t,i]
                                + beta2[prev,state]*x2[t,i]
                               + beta3[prev,state]*x3[t,i])
    }}

  alpha[state,state,i] <- 0

      for (t in 1:T) {
  totalodds[state,i,t] <- odds[state,1,i,t] + odds[state,2,i,t]
  }
}
alpha[1,2,i] <- raneffs[i,1]
alpha[2,1,i] <- raneffs[i,2]
raneffs[i,1:2] ~ dmnorm(alpha.means[1:2],alpha.prec[1:2, 1:2])
}

for (state in 1:M)
  {
  beta1[state,state] <- 0
  beta2[state,state] <- 0
  beta3[state,state] <- 0
  }

beta1[1,2] <- rcoeff[1]
beta1[2,1] <- rcoeff[2]
beta2[1,2] <- rcoeff[3]
beta2[2,1] <- rcoeff[4]
beta3[1,2] <- rcoeff[5]
beta3[2,1] <- rcoeff[6]

alpha.Sigma[1:2,1:2] <- inverse(alpha.prec[1:2,1:2])
probs1[1:M] ~ ddirich(statealpha[1:M])
for (par in 1:6)
{
alpha.means[par] ~ dt(T.constant.mu,T.constant.tau,T.constant.k)
rcoeff[par] ~ dt(T.mu, T.tau, T.k)
}

T.constant.mu <- 0
T.mu <- 0
T.constant.tau <- 1/T.constant.scale.squared
T.tau <- 1/T.scale.squared
T.constant.scale.squared <- T.constant.scale*T.constant.scale
T.scale.squared <- T.scale*T.scale
T.scale <- 2.5
T.constant.scale <- 10
T.constant.k <- 1
T.k <- 1
alpha.prec[1:2,1:2] ~ dwish(Om[1:2,1:2],2)
Om[1,1] <- 1
Om[1,2] <- 0
Om[2,1] <- 0
Om[2,2] <- 1

## Prediction
for (i in 1:N)
    {

   m_pred[1,i] <- m[1,i]

    for (t in 2:T)  
      {
      m_pred[t,i] ~ dcat(ps_pred[i,t,])
      }

    for (state in 1:M)
      {
      ps_pred[i,1,state] <- probs1[state]

      for (t in 2:T)
        {
        ps_pred[i,t,state] <- probs_pred[m_pred[(t-1),i], state, i,t]
        }

      for (prev in 1:M)
        {

       for (t in 1:T)
         {
        probs_pred[prev,state,i,t] <- odds_pred[prev,state,i,t]/totalodds_pred[prev,i,t]
        odds_pred[prev,state,i,t] <- exp(alpha[prev,state,i] +
                                    beta1[prev,state]*x1[t,i]
                                    + beta2[prev,state]*x2[t,i]
                                   + beta3[prev,state]*x3[t,i])
        }}

      for (t in 1:T) {
      totalodds_pred[state,i,t] <- odds_pred[state,1,i,t] + odds_pred[state,2,i,t]
       }
      }
  }

Answer 1

TL;DR：我认为你只是错过了一个可能性。

您的模型很复杂，所以我可能遗漏了一些东西，但据我所知，这是不可能的。您提供预测变量 x1、x2 和 x3 作为数据，但您没有提供任何观察到的 m。那么 JAGS 在什么意义上可以成为 "fitting" 模型？

回答您的问题：

是的，m 似乎是从以模型其余部分为条件的分类分布中随机抽取的。由于没有 m 作为数据提供，none 的参数分布有更新的原因，因此 m 的结果与您随机抽取的结果没有什么不同来自所有先验，并通过 R 或其他任何模型传播它们。
尽管它在任何意义上仍不构成拟合模型，但您可以自由提供 beta1 的值（如果它们尚未在模型中完全定义）。 JAGS 正在抱怨，因为目前 beta1[i] = rcoeff[i] ~ dt(T.mu, T.tau, T.k)，并且 T 分布的参数都是固定的。如果 (T.mu, T.tau, T.k) 中的任何一个被赋予先验（将它们识别为随机），那么 beta1 可以作为数据提供，而 JAGS 会将 rcoeff[i] ~ dt(T.mu, T.tau, T.k) 视为可能性。但在模型的当前形式中，就 JAGS 而言，如果您提供 beta1 作为数据，这与模型中已有的固定定义相冲突。

我在这里伸展，但我的猜测是，如果您使用的是 JAGS，那么您也已经（或想要）在 JAGS 中拟合模型。在 jags 模型中包含观察到的响应和期望的预测响应是一种常见的模式，例如像这样：

model {
  b ~ dnorm(0, 1) # prior on b

  for(i in 1:N) {
    y[i] ~ dnorm(b * x[i], 1) # Likelihood of y | b (and fixed precision = 1 for the example)
  }

  for(i in 1:N_pred) {
    pred_y[i] ~ dnorm(b * pred_x[i], 1) # Prediction
  }
}

在此示例模型中，x、y 和 pred_x 作为数据提供，未知参数 b 将被估计，我们希望pred_x 的每个值的后验预测 pred_y。 JAGS 知道第一个 for 循环中的分布是一种可能性，因为 y 是作为数据提供的。 b 的后验样本将受此可能性的约束。第二个 for 循环看起来很相似，但是由于 pred_y 没有作为数据提供，所以它无法约束 b。相反，JAGS 知道简单地绘制 pred_y 以 b 和提供的 pred_x 为条件的样本。 pred_x 的值通常定义为与观察到的 x 相同，为每个观察到的数据点提供预测区间，或者作为沿 x 轴的规则值序列以生成平滑的预测区间.

使用 R 中的 JAGS 进行结果预测

Outcome prediction using JAGS from R

r

jags

runjags

rjags