jags.model 节点错误与父节点不一致
Error in jags.model node inconsistent with parents
我正在使用 beta-binomial n-mixture 模型来估计丰度。我拥有的代码在模拟和自然数据集上都运行良好。但是,当我 运行 一个特定的数据集时,这个错误弹出 Error in jags.model(etc...) Error in node N[1] Node inconsistent with parents. 我想知道为什么这个数据集有问题而不是其他数据集。 数据集中有相当数量的零不起作用,但还有更多确实有效的数据集也有相当数量的 0。先验值我玩过,范围从0.001到10。数据结构和模型如下
n.site <- 6
R <- n.site #number of sites
T <- 10 #number of replicate counts
y <- array(dim = c(R, T)) #null array
y <- array(sur.2019$Num_total, dim = c(R, T)) #populate y
C <- c(y)
C <- as.numeric(C)
site = 1:R
site.p <- rep(site, T)
sink("Model.txt")
cat("
model {
# Priors
lam~dgamma(0.01,0.01)
alpha.p~dgamma(0.01,0.01)
beta.p~dgamma(0.01,0.01)
# Likelihood
#the next four lines are used to model N as a zero-truncated distribution
probs[R+1]<- 1-sum(probs[1:R])
for(i in 1:R){
probs[i]<- exp(-lam)*(pow(lam,x[i]))/(exp(logfact(x[i])) * (1-exp(-lam)))
N[i] ~ dcat(probs[])}
# Observation model for replicated counts
for (i in 1:n) {
C[i] ~ dbin(p[i], N[site.p[i]])
p[i]~dbeta(alpha.p,beta.p)
# Assess model fit using Chi-squared discrepancy
# Compute fit statistic for observed data
eval[i] <- p[i]*N[site.p[i]]
E[i] <- pow((C[i] - eval[i]),2) / (eval[i] + 0.5)
# Generate replicate data and compute fit stats
C.new[i] ~ dbin(p[i], N[site.p[i]])
E.new[i] <- pow((C.new[i] - eval[i]),2) / (eval[i] + 0.5)
} # ends i loop
# Derived and other quantities
totalN <- sum(N[]) # Estimate abundance across all sites
mean.abundance <- lam #mean expected abundance per plot
p.derived<-alpha.p/(alpha.p+beta.p) #derived detection probability
rho.derived<-1/(alpha.p+beta.p+1) #correlation coefficient
fit <- sum(E[])
fit.new <- sum(E.new[])
}
",fill = TRUE)
sink()
R = nrow(y)
T = ncol(y)
n = dim(y)[1] * dim(y)[2]#number of observations (sites*surveys)
nmm.data <- list(C = C, n=n, R = R, site.p = site.p, x=1:R)
# Initial values
Nst <- apply(y, 1, max) + 1 #changed from apply(y, 1, max) + 1
Nst[is.na(Nst)] <- 1
inits <- function(){list(N = Nst, lam = runif(1, 1, 7),alpha.p=runif(1,0.5,1), beta.p=runif(1,0.5,1))}
# Define parameters to be monitored
params <- c("totalN", "mean.abundance", "lam", "p.derived", "rho.derived", "fit", "fit.new","alpha.p","beta.p")
# MCMC settings
ni <- 14000
nt <- 1
nb <- 4000
nc <- 3
abun.1 <- jags(data = nmm.data,
parameters = params,
inits = inits,
model = "model.txt",
n.thin = nt,
n.chains = nc,
n.burnin = nb,
n.iter = ni)
abun.1.mcmc <- as.mcmc(abun.1)
不工作的数据集看起来像这样
nmm.data <- list(C = C, n=n, R = R, site.p = site.p, x=1:R)
nmm.data
$C
[1] 7 0 1 0 0 0 2 0 0 0 0 1 3 2 0 0 0 0 1 0 0 0 0 0 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[55] 0 0 0 0 0 0
$n
[1] 60
$R
[1] 6
$site.p
[1] 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6
[55] 1 2 3 4 5 6
$x
[1] 1 2 3 4 5 6
虽然工作的数据集看起来像这样
hmm.data <- list(C = C, n=n, R = R, site.p = site.p, x=1:R)
hmm.data
$C
[1] 0 0 0 0 0 2 0 1 0 0 0 0 0 3 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 1 0 0 0 0 1 1 0 0 0 0 0 0 1 1 0 1 0 0 0 1 0 0
[55] 0 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[109] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[163] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[217] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
$n
[1] 60
$R
[1] 6
$site.p
[1] 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6
[55] 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6
[109] 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6
[163] 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6
[217] 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6
$x
[1] 1 2 3 4 5 6
正如我所说,我已经尝试使用先验知识,但仍然在节点 N[1] 处遇到问题。我是 JAGS 的新手,不确定如何解决这个问题,因为它似乎不是编码问题,而是数据问题。任何人有任何想法可能会发生什么?我感谢任何人花时间查看它。
nmm.data 的输入是
list(C = c(7, 0, 1, 0, 0, 0, 2, 0, 0, 0, 0, 1, 3, 2, 0, 0, 0,
0, 1, 0, 0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0), n = 60L, R = 6L, site.p = c(1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L,
3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L,
1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L,
5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L,
3L, 4L, 5L, 6L), x = 1:6)
hmm.data 是
list(C = c(0, 0, 0, 0, 0, 2, 0, 1, 0, 0, 0, 0, 0, 3, 0, 1, 0,
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1,
0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0), n = 258L, R = 6L, site.p = c(1L,
2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L,
6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L,
4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L,
2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L,
6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L,
4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L,
2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L,
6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L,
4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L,
2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L,
6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L,
4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L,
2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L,
6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L,
4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L,
2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L,
6L), x = 1:6)
仔细检查您的代码后,我能够使用 nmm.data
训练贝叶斯模型。我发现了两个问题。您的先验可能会产生非常小的值,这会在模型中产生问题。你可以改变它,但我没有那样做。第二个问题是你的初始值。它们太大了,这会影响链条的性能。由于您的先验值较小且初始值较大,因此使用大量迭代会弄乱一切。所以,当我使用麻烦的数据集时,我所做的是减少迭代次数(而且你的数据很小,当你有大数据时,如另一个列表,问题不会出现)。代码在这里,在最后一部分我会给你一些提取结果的选项。
首先是模型(没有变化,但添加了固定的随机种子以保持结果相似):
library(rjags)
set.seed(123)
#Code for model
mymod <- "
model {
# Priors
lam~dgamma(0.01,0.01)
alpha.p~dgamma(0.01,0.01)
beta.p~dgamma(0.01,0.01)
# Likelihood
#the next four lines are used to model N as a zero-truncated distribution
probs[R+1]<- 1-sum(probs[1:R])
for(i in 1:R){
probs[i]<- exp(-lam)*(pow(lam,x[i]))/(exp(logfact(x[i])) * (1-exp(-lam)))
N[i] ~ dcat(probs[])}
# Observation model for replicated counts
for (i in 1:n) {
C[i] ~ dbin(p[i], N[site.p[i]])
p[i]~dbeta(alpha.p,beta.p)
# Assess model fit using Chi-squared discrepancy
# Compute fit statistic for observed data
eval[i] <- p[i]*N[site.p[i]]
E[i] <- pow((C[i] - eval[i]),2) / (eval[i] + 0.5)
# Generate replicate data and compute fit stats
C.new[i] ~ dbin(p[i], N[site.p[i]])
E.new[i] <- pow((C.new[i] - eval[i]),2) / (eval[i] + 0.5)
} # ends i loop
# Derived and other quantities
totalN <- sum(N[]) # Estimate abundance across all sites
mean.abundance <- lam #mean expected abundance per plot
p.derived<-alpha.p/(alpha.p+beta.p) #derived detection probability
rho.derived<-1/(alpha.p+beta.p+1) #correlation coefficient
fit <- sum(E[])
fit.new <- sum(E.new[])
}
"
现在,我们将定义与您相同的值和初始值函数(我稍微调整了函数):
#Data
#Values
n.site <- 6
R <- n.site #number of sites
T <- 10 #number of replicate counts
y <- array(dim = c(R, T)) #null array
# Initial values
Nst <- apply(y, 1, max) + 1 #changed from apply(y, 1, max) + 1
Nst[is.na(Nst)] <- 1
inits <- function(){list(N = Nst, lam = runif(1, 1, 7),
alpha.p=runif(1,0.1,0.5),
beta.p=runif(1,0.5,1))}
接下来,我们将定义模型的设置并对其进行训练(我没有使用其中一些值,但您可以尝试不同的设置):
# MCMC settings using nmm.data
ni <- 14000
nt <- 1
nb <- 4000
nc <- 3
#Model
m1 <- jags.model(file=textConnection(mymod),
data=nmm.data,n.chains=3,inits=inits(),quiet = T)
#Update
update(m1, n.iter=1000,progress.bar = "none")
使用前面的代码,模型已经训练和更新。由于数据中的值较小,我减少了迭代次数。增加该数字可以生成无限值,训练模型过程将停止。所以,现在我们的模型已经准备好了。我们将使用相同的函数 as.mcmc()
来创建您想要的对象,并使用 coda.samples()
函数从每个链中提取参数:
#Parameters
params <- c("totalN", "mean.abundance", "lam", "p.derived",
"rho.derived", "fit", "fit.new","alpha.p","beta.p")
#Extract results
res1 <- coda.samples(m1,variable.names=params,n.iter=1000)
#Code for extracting
m1.mcmc <- as.mcmc(m1)
#Example
d1 <- as.data.frame(res1[[1]])
您将创建对象 m1.mcmc
,并且在 res1
中还将保存每个参数链的结果。我添加了示例 d1
,其输出是下一个(只有一些行 head(d1)
因为输出很大):
alpha.p beta.p fit fit.new lam mean.abundance p.derived rho.derived totalN
1 0.04932755 0.5086096 3.390977 3.765055 2.517308 2.517308 0.08841059 0.6418744 16
2 0.06047980 0.5160444 2.218640 3.986287 2.453345 2.453345 0.10490418 0.6343068 18
3 0.05784805 0.4795353 1.286302 2.862089 3.213626 3.213626 0.10764764 0.6504558 17
4 0.05672757 0.3837686 2.980358 2.260288 1.679854 1.679854 0.12878106 0.6942052 15
5 0.05653906 0.8254216 6.293583 2.671746 3.382379 3.382379 0.06410611 0.5313607 15
6 0.06015773 0.5851066 5.566315 2.555359 5.338035 5.338035 0.09322960 0.6078051 26
因此,当您使用较大的数据时,链会收敛,而使用较小的数据时 nmm.data
请注意初始值和迭代次数。
我正在使用 beta-binomial n-mixture 模型来估计丰度。我拥有的代码在模拟和自然数据集上都运行良好。但是,当我 运行 一个特定的数据集时,这个错误弹出 Error in jags.model(etc...) Error in node N[1] Node inconsistent with parents. 我想知道为什么这个数据集有问题而不是其他数据集。 数据集中有相当数量的零不起作用,但还有更多确实有效的数据集也有相当数量的 0。先验值我玩过,范围从0.001到10。数据结构和模型如下
n.site <- 6
R <- n.site #number of sites
T <- 10 #number of replicate counts
y <- array(dim = c(R, T)) #null array
y <- array(sur.2019$Num_total, dim = c(R, T)) #populate y
C <- c(y)
C <- as.numeric(C)
site = 1:R
site.p <- rep(site, T)
sink("Model.txt")
cat("
model {
# Priors
lam~dgamma(0.01,0.01)
alpha.p~dgamma(0.01,0.01)
beta.p~dgamma(0.01,0.01)
# Likelihood
#the next four lines are used to model N as a zero-truncated distribution
probs[R+1]<- 1-sum(probs[1:R])
for(i in 1:R){
probs[i]<- exp(-lam)*(pow(lam,x[i]))/(exp(logfact(x[i])) * (1-exp(-lam)))
N[i] ~ dcat(probs[])}
# Observation model for replicated counts
for (i in 1:n) {
C[i] ~ dbin(p[i], N[site.p[i]])
p[i]~dbeta(alpha.p,beta.p)
# Assess model fit using Chi-squared discrepancy
# Compute fit statistic for observed data
eval[i] <- p[i]*N[site.p[i]]
E[i] <- pow((C[i] - eval[i]),2) / (eval[i] + 0.5)
# Generate replicate data and compute fit stats
C.new[i] ~ dbin(p[i], N[site.p[i]])
E.new[i] <- pow((C.new[i] - eval[i]),2) / (eval[i] + 0.5)
} # ends i loop
# Derived and other quantities
totalN <- sum(N[]) # Estimate abundance across all sites
mean.abundance <- lam #mean expected abundance per plot
p.derived<-alpha.p/(alpha.p+beta.p) #derived detection probability
rho.derived<-1/(alpha.p+beta.p+1) #correlation coefficient
fit <- sum(E[])
fit.new <- sum(E.new[])
}
",fill = TRUE)
sink()
R = nrow(y)
T = ncol(y)
n = dim(y)[1] * dim(y)[2]#number of observations (sites*surveys)
nmm.data <- list(C = C, n=n, R = R, site.p = site.p, x=1:R)
# Initial values
Nst <- apply(y, 1, max) + 1 #changed from apply(y, 1, max) + 1
Nst[is.na(Nst)] <- 1
inits <- function(){list(N = Nst, lam = runif(1, 1, 7),alpha.p=runif(1,0.5,1), beta.p=runif(1,0.5,1))}
# Define parameters to be monitored
params <- c("totalN", "mean.abundance", "lam", "p.derived", "rho.derived", "fit", "fit.new","alpha.p","beta.p")
# MCMC settings
ni <- 14000
nt <- 1
nb <- 4000
nc <- 3
abun.1 <- jags(data = nmm.data,
parameters = params,
inits = inits,
model = "model.txt",
n.thin = nt,
n.chains = nc,
n.burnin = nb,
n.iter = ni)
abun.1.mcmc <- as.mcmc(abun.1)
不工作的数据集看起来像这样
nmm.data <- list(C = C, n=n, R = R, site.p = site.p, x=1:R)
nmm.data
$C
[1] 7 0 1 0 0 0 2 0 0 0 0 1 3 2 0 0 0 0 1 0 0 0 0 0 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[55] 0 0 0 0 0 0
$n
[1] 60
$R
[1] 6
$site.p
[1] 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6
[55] 1 2 3 4 5 6
$x
[1] 1 2 3 4 5 6
虽然工作的数据集看起来像这样
hmm.data <- list(C = C, n=n, R = R, site.p = site.p, x=1:R)
hmm.data
$C
[1] 0 0 0 0 0 2 0 1 0 0 0 0 0 3 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 1 0 0 0 0 1 1 0 0 0 0 0 0 1 1 0 1 0 0 0 1 0 0
[55] 0 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[109] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[163] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[217] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
$n
[1] 60
$R
[1] 6
$site.p
[1] 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6
[55] 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6
[109] 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6
[163] 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6
[217] 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6
$x
[1] 1 2 3 4 5 6
正如我所说,我已经尝试使用先验知识,但仍然在节点 N[1] 处遇到问题。我是 JAGS 的新手,不确定如何解决这个问题,因为它似乎不是编码问题,而是数据问题。任何人有任何想法可能会发生什么?我感谢任何人花时间查看它。
nmm.data 的输入是
list(C = c(7, 0, 1, 0, 0, 0, 2, 0, 0, 0, 0, 1, 3, 2, 0, 0, 0,
0, 1, 0, 0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0), n = 60L, R = 6L, site.p = c(1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L,
3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L,
1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L,
5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L,
3L, 4L, 5L, 6L), x = 1:6)
hmm.data 是
list(C = c(0, 0, 0, 0, 0, 2, 0, 1, 0, 0, 0, 0, 0, 3, 0, 1, 0,
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1,
0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0), n = 258L, R = 6L, site.p = c(1L,
2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L,
6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L,
4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L,
2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L,
6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L,
4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L,
2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L,
6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L,
4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L,
2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L,
6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L,
4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L,
2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L,
6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L,
4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L,
2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L,
6L), x = 1:6)
仔细检查您的代码后,我能够使用 nmm.data
训练贝叶斯模型。我发现了两个问题。您的先验可能会产生非常小的值,这会在模型中产生问题。你可以改变它,但我没有那样做。第二个问题是你的初始值。它们太大了,这会影响链条的性能。由于您的先验值较小且初始值较大,因此使用大量迭代会弄乱一切。所以,当我使用麻烦的数据集时,我所做的是减少迭代次数(而且你的数据很小,当你有大数据时,如另一个列表,问题不会出现)。代码在这里,在最后一部分我会给你一些提取结果的选项。
首先是模型(没有变化,但添加了固定的随机种子以保持结果相似):
library(rjags)
set.seed(123)
#Code for model
mymod <- "
model {
# Priors
lam~dgamma(0.01,0.01)
alpha.p~dgamma(0.01,0.01)
beta.p~dgamma(0.01,0.01)
# Likelihood
#the next four lines are used to model N as a zero-truncated distribution
probs[R+1]<- 1-sum(probs[1:R])
for(i in 1:R){
probs[i]<- exp(-lam)*(pow(lam,x[i]))/(exp(logfact(x[i])) * (1-exp(-lam)))
N[i] ~ dcat(probs[])}
# Observation model for replicated counts
for (i in 1:n) {
C[i] ~ dbin(p[i], N[site.p[i]])
p[i]~dbeta(alpha.p,beta.p)
# Assess model fit using Chi-squared discrepancy
# Compute fit statistic for observed data
eval[i] <- p[i]*N[site.p[i]]
E[i] <- pow((C[i] - eval[i]),2) / (eval[i] + 0.5)
# Generate replicate data and compute fit stats
C.new[i] ~ dbin(p[i], N[site.p[i]])
E.new[i] <- pow((C.new[i] - eval[i]),2) / (eval[i] + 0.5)
} # ends i loop
# Derived and other quantities
totalN <- sum(N[]) # Estimate abundance across all sites
mean.abundance <- lam #mean expected abundance per plot
p.derived<-alpha.p/(alpha.p+beta.p) #derived detection probability
rho.derived<-1/(alpha.p+beta.p+1) #correlation coefficient
fit <- sum(E[])
fit.new <- sum(E.new[])
}
"
现在,我们将定义与您相同的值和初始值函数(我稍微调整了函数):
#Data
#Values
n.site <- 6
R <- n.site #number of sites
T <- 10 #number of replicate counts
y <- array(dim = c(R, T)) #null array
# Initial values
Nst <- apply(y, 1, max) + 1 #changed from apply(y, 1, max) + 1
Nst[is.na(Nst)] <- 1
inits <- function(){list(N = Nst, lam = runif(1, 1, 7),
alpha.p=runif(1,0.1,0.5),
beta.p=runif(1,0.5,1))}
接下来,我们将定义模型的设置并对其进行训练(我没有使用其中一些值,但您可以尝试不同的设置):
# MCMC settings using nmm.data
ni <- 14000
nt <- 1
nb <- 4000
nc <- 3
#Model
m1 <- jags.model(file=textConnection(mymod),
data=nmm.data,n.chains=3,inits=inits(),quiet = T)
#Update
update(m1, n.iter=1000,progress.bar = "none")
使用前面的代码,模型已经训练和更新。由于数据中的值较小,我减少了迭代次数。增加该数字可以生成无限值,训练模型过程将停止。所以,现在我们的模型已经准备好了。我们将使用相同的函数 as.mcmc()
来创建您想要的对象,并使用 coda.samples()
函数从每个链中提取参数:
#Parameters
params <- c("totalN", "mean.abundance", "lam", "p.derived",
"rho.derived", "fit", "fit.new","alpha.p","beta.p")
#Extract results
res1 <- coda.samples(m1,variable.names=params,n.iter=1000)
#Code for extracting
m1.mcmc <- as.mcmc(m1)
#Example
d1 <- as.data.frame(res1[[1]])
您将创建对象 m1.mcmc
,并且在 res1
中还将保存每个参数链的结果。我添加了示例 d1
,其输出是下一个(只有一些行 head(d1)
因为输出很大):
alpha.p beta.p fit fit.new lam mean.abundance p.derived rho.derived totalN
1 0.04932755 0.5086096 3.390977 3.765055 2.517308 2.517308 0.08841059 0.6418744 16
2 0.06047980 0.5160444 2.218640 3.986287 2.453345 2.453345 0.10490418 0.6343068 18
3 0.05784805 0.4795353 1.286302 2.862089 3.213626 3.213626 0.10764764 0.6504558 17
4 0.05672757 0.3837686 2.980358 2.260288 1.679854 1.679854 0.12878106 0.6942052 15
5 0.05653906 0.8254216 6.293583 2.671746 3.382379 3.382379 0.06410611 0.5313607 15
6 0.06015773 0.5851066 5.566315 2.555359 5.338035 5.338035 0.09322960 0.6078051 26
因此,当您使用较大的数据时,链会收敛,而使用较小的数据时 nmm.data
请注意初始值和迭代次数。