Simulating correlated answers to a multi-choice test

Question

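I am trying to simulate the answers to a multiple-choice question (MCQ) test. Currently, I am using the following code to simulate the answers to an MCQ test with only two questions: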

answers <- data.frame(
Q1 = sample(LETTERS[1:5],10,replace = T, prob=c(0.1,0.6,0.1,0.1,0.1)),
Q2 = sample(LETTERS[1:5],10,replace = T, prob=c(0.5,0.1,0.1,0.2,0.1)))

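Answers B and A are the correct answers to Q1 and Q2, respectively.

My difficulty is introducing correlation between the answers to the questions, so that, for example, a good student tends to select the correct answer to all questions. How can I do this?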

Answer 1

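You could fill the data with fully correct answers, assign each student a proficiency level, and then randomly change values in each student's exam depending on that proficiency: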

correct = c(2,1,3)
nstudents = 20
exam = matrix(LETTERS[rep(correct,nstudents)],ncol=length(correct),byrow=T)
colnames(exam)=paste("Q",1:length(correct),sep="")

proficiency = runif(nstudents,1,5)/5 ## Each student has a level of expertise

for(question in 1:length(correct)){
  difficulty = runif(nstudents,1,10)/10  ## Random difficulty for each question and student (may be made more or less difficult)
  nmistakes = sum(proficiency<difficulty)
  exam[,question][proficiency<difficulty] = sample(LETTERS[1:5],nmistakes,replace=T)
}

exam = as.data.frame(exam)

The result is a data frame in which some students almost never make mistakes, while others almost never get an answer right.
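
One way to see the dependence between questions is to turn the answers into correct/incorrect indicators and count each student's score. The following is a minimal sketch under the same setup as above; the names `key`, `guess`, `is_correct` and `score`, the seed, and the larger number of students are illustrative assumptions, not part of the original answer.

```r
set.seed(1)  # assumption: seed added only for reproducibility

correct   <- c(2, 1, 3)
nstudents <- 200             # assumption: more students so the pattern is visible
key       <- LETTERS[correct]

# Start from a fully correct exam, one row per student
exam <- matrix(rep(key, nstudents), ncol = length(correct), byrow = TRUE)
colnames(exam) <- paste0("Q", seq_along(correct))

proficiency <- runif(nstudents, 1, 5) / 5

for (question in seq_along(correct)) {
  difficulty <- runif(nstudents, 1, 10) / 10
  guess <- proficiency < difficulty   # these students answer at random
  exam[guess, question] <- sample(LETTERS[1:5], sum(guess), replace = TRUE)
}

# TRUE where the student's answer matches the key
is_correct <- exam == matrix(key, nrow = nstudents, ncol = length(correct), byrow = TRUE)
score <- rowSums(is_correct)

table(score)              # distribution of total scores
cor(proficiency, score)   # relation between proficiency and score
cor(is_correct * 1)       # pairwise correlation between questions
```

Note that a random replacement can still land on the correct letter, so a "mistake" here is really a guess with a one-in-five chance of being right.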

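Edit: In this case, the proficiency levels follow a uniform distribution. If you need them to be normally distributed, just change the proficiency vector to use rnorm().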
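
As an illustration, a normally distributed proficiency could be drawn as follows. This is only a sketch: the mean of 0.6 and standard deviation of 0.2 are arbitrary assumptions, and the values are clamped to [0, 1] so they stay on the same scale as the difficulty.

```r
set.seed(1)  # assumption: seed added only for reproducibility
nstudents <- 20

# Normally distributed proficiency; mean and sd are illustrative choices
proficiency <- rnorm(nstudents, mean = 0.6, sd = 0.2)

# Clamp to [0, 1] so the comparison with difficulty (0.1 to 1) stays meaningful
proficiency <- pmin(pmax(proficiency, 0), 1)

summary(proficiency)
```

The rest of the loop can stay unchanged, since it only compares proficiency with difficulty.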

Answer 2

Here is an approach that uses MASS::mvrnorm and applies a covariance matrix through its Sigma= argument.

n <- 15
r <- .9
set.seed(42)
library('MASS')
M <- abs(mvrnorm(n=n, mu=c(1, 500), Sigma=matrix(c(1, r, r, 1), nrow=2), 
                empirical=TRUE)) |>
  as.data.frame() |>
  setNames(c('Q1', 'Q2'))

We obtain the correlated levels A, ..., E by using cut to bin the random numbers along custom quantiles (with the probabilities taken from the OP):

f <- \(x, q) cut(x, breaks=c(0, quantile(x, cumsum(q))), include.lowest=T, 
                 labels=LETTERS[1:5])

p1 <- c(0.1, 0.6, 0.1, 0.1, 0.1)
p2 <- c(0.5, 0.1, 0.1, 0.2, 0.1)

The function f is applied to each column of M, with its matching probabilities, in a Map() call:

dat <- Map(f, M, list(p1, p2)) |>
  as.data.frame()
dat
#    Q1 Q2
# 1   A  A
# 2   B  A
# 3   E  E
# 4   D  E
# 5   A  A
# 6   B  A
# 7   C  D
# 8   B  A
# 9   B  A
# 10  B  A
# 11  B  C
# 12  B  B
# 13  E  D
# 14  B  A
# 15  C  D

Check:

dat_check <- lapply(dat, as.integer) |> as.data.frame()
cor(dat_check)  ## correlation
#         Q1      Q2
# Q1 1.00000 0.85426
# Q2 0.85426 1.00000

lapply(dat, table)  ## students' answers
# $Q1
# 
# A B C D E 
# 2 8 2 1 2 
# 
# $Q2
# 
# A B C D E 
# 8 1 1 3 2
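
The correlation above is measured on the letter codes, not on correctness. To see how often the same student gets both questions right, the answers can be compared with the key from the question (B for Q1, A for Q2). The sketch below assumes a larger sample than above, and `f_safe` is a hypothetical variant of `f` with open-ended outer breaks, used here so that rounding in cumsum(q) cannot leave an extreme value unbinned.

```r
library(MASS)
set.seed(42)

n <- 1000   # assumption: larger sample than above, to stabilise the proportions
r <- 0.9

M <- abs(mvrnorm(n = n, mu = c(1, 500),
                 Sigma = matrix(c(1, r, r, 1), nrow = 2),
                 empirical = TRUE)) |>
  as.data.frame() |>
  setNames(c("Q1", "Q2"))

# Hypothetical variant of f(): only the inner quantiles are computed,
# and the outer breaks are -Inf and Inf
f_safe <- function(x, q) {
  inner <- quantile(x, head(cumsum(q), -1), names = FALSE)
  cut(x, breaks = c(-Inf, inner, Inf), labels = LETTERS[1:5])
}

p1 <- c(0.1, 0.6, 0.1, 0.1, 0.1)
p2 <- c(0.5, 0.1, 0.1, 0.2, 0.1)
dat <- Map(f_safe, M, list(p1, p2)) |> as.data.frame()

# Correct answers according to the question: B for Q1, A for Q2
q1_ok <- dat$Q1 == "B"
q2_ok <- dat$Q2 == "A"

table(Q1_correct = q1_ok, Q2_correct = q2_ok)
cor(as.numeric(q1_ok), as.numeric(q2_ok))
```

The marginal shares of correct answers should stay close to the requested 0.6 and 0.5, while the cross-table shows how strongly correctness on one question goes along with correctness on the other.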
