STAN IRT via R programming, issue with parameter declaration?

Question

I am following this official IRT with STAN tutorial. The model details are copied below:

data {
  int<lower=1> J;              // number of students
  int<lower=1> K;              // number of questions
  int<lower=1> N;              // number of observations
  int<lower=1,upper=J> jj[N];  // student for observation n
  int<lower=1,upper=K> kk[N];  // question for observation n
  int<lower=0,upper=1> y[N];   // correctness for observation n
}

parameters {
  real delta;         // mean student ability
  real alpha[J];      // ability of student j - mean ability
  real beta[K];       // difficulty of question k
}

model {
  alpha ~ std_normal();         // informative true prior
  beta ~ std_normal();          // informative true prior
  delta ~ normal(0.75, 1);      // informative true prior
  for (n in 1:N)
    y[n] ~ bernoulli_logit(alpha[jj[n]] - beta[kk[n]] + delta);
}

I am not sure which variables need to be declared in the R code and which do not.

toy_data <- list(
  J= 5,
  K = 4,
  N =20,
  y= c(1,1,1,1,1,1,1,0,1,1,0,0,1,0,0,0,0,0,0,0)
                )
fit <- stan(file = '1PL_stan.stan', data = toy_data)

However, this triggers the following error.

Error in mod$fit_ptr() : 
  Exception: variable does not exist; processing stage=data initialization; variable name=jj; base type=int  (in 'model920c4330dff_1PL_stan' at line 5)

In addition: Warning messages:
1: In readLines(file, warn = TRUE) :
  incomplete final line found on 'C:\Users\jacob.moore\DownloadsPL_stan.stan'
2: In system(paste(CXX, ARGS), ignore.stdout = TRUE, ignore.stderr = TRUE) :
  '-E' not found
failed to create the sampler; sampling not done

In my past work I have used Python almost exclusively, so learning R has been a bumpy process. I am also new to STAN, hence the toy example.

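The core idea is that there are 20 child/question pairs: 5 children and 4 different questions. I am not sure why my code triggers the error or how I should correct it. Could you clarify what needs to change so that this code runs without triggering the error?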

Answer 1

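Every variable declared in the data block (J, K, N, jj, kk, and y) needs to be included in the list toy_data. You left out jj and kk, which is why the error reports that the variable jj does not exist.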

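You have 5 students (J=5), each answering 4 questions (K=4). jj is the student ID and kk is the question ID for each observation, so assuming your responses are ordered by student and then by question, you would get something like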

jj = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5)
kk = c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4)
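
Putting this together, here is a minimal sketch of the full data list, assuming the 20 responses in y really are ordered by student and then by question (the question does not state the ordering, so check it against your data). The index vectors are built with rep() instead of being typed out, and the fit is only attempted if rstan is installed and the model file from the question is present in the working directory.

```r
# Toy design from the question: 5 students, 4 questions, 20 responses.
J <- 5
K <- 4

# Index vectors, ASSUMING y is ordered by student and then by question.
jj <- rep(1:J, each = K)   # student ID:  1,1,1,1,2,2,2,2,...
kk <- rep(1:K, times = J)  # question ID: 1,2,3,4,1,2,3,4,...

# Responses copied from the question.
y <- c(1,1,1,1,1,1,1,0,1,1,0,0,1,0,0,0,0,0,0,0)

# Every variable declared in the Stan data block must appear here by name.
toy_data <- list(J = J, K = K, N = length(y), jj = jj, kk = kk, y = y)

# Sanity checks mirroring the constraints in the data block.
stopifnot(
  length(jj) == toy_data$N,
  length(kk) == toy_data$N,
  all(jj >= 1 & jj <= J),
  all(kk >= 1 & kk <= K),
  all(y %in% c(0, 1))
)

# Fit the model only if rstan and the model file are available.
if (requireNamespace("rstan", quietly = TRUE) && file.exists("1PL_stan.stan")) {
  fit <- rstan::stan(file = "1PL_stan.stan", data = toy_data)
}
```

The names in the list must match the names in the data block exactly. The parameters (delta, alpha, beta) are not passed in, since Stan estimates them.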
