Simulate attrition in R

I simulated the following generic "survey experiment" data:

n <- 100
df <- data.frame(
  Q1  = sample(18:90, n, replace = TRUE),        # age
  Q2  = sample(c("m", "f"), n, replace = TRUE),  # sex
  Q3  = sample(c(0, 1), n, replace = TRUE, prob = c(0.55, 0.45)),  # other general pre-treatment questions
  Q4  = sample(c(0, 1), n, replace = TRUE),
  Q5  = sample(c(0, 1), n, replace = TRUE),      # treatment
  Q6  = sample(c(0, 1), n, replace = TRUE),      # post-treatment
  Q7  = sample(c(0, 1), n, replace = TRUE),
  Q8  = sample(c(0, 1), n, replace = TRUE),
  Q9  = sample(c(0, 1), n, replace = TRUE),
  Q10 = sample(c(0, 1), n, replace = TRUE)
)

I would like to randomly simulate attrition (NA) in this data. The following question deals with a similar problem: How do I add random `NA`s into a data frame

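However, I am interested in generating data that simulates respondents leaving the survey entirely, so that every question after the drop-out point is missing. It might look like this: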

Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10
18  m  1  0 NA NA NA NA NA  NA
30  f NA NA NA NA NA NA NA  NA
25  f  1  0  1  0 NA NA NA  NA

Thanks!

In base R,

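for (i in seq_len(nrow(df))) {
    # pick the question (column 3 to 10) at which this respondent drops out
    a <- sample(3:10, 1)
    # set that question and every later one to NA
    df[i, a:ncol(df)] <- NA
}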

head(df)

which gives,

  Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10
1 29  f  1  1  1  0 NA NA NA  NA
2 59  f NA NA NA NA NA NA NA  NA
3 48  m  1  0 NA NA NA NA NA  NA
4 38  m  0  1  0 NA NA NA NA  NA
5 30  f  1  1  0  0 NA NA NA  NA
6 57  m  1  1  1  1  0 NA NA  NA
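
Note that with `sample(3:10, 1)` every respondent drops out at Q10 at the latest, so no one in the simulated data completes the survey. If some completers are wanted, one possible variation is sketched below. It is only an illustration under stated assumptions: the helper name `simulate_attrition` and its arguments `first_drop` and `p_complete` are hypothetical and not part of the answer above, and the 30% completion rate is an arbitrary placeholder.

```r
# Hypothetical helper: blank out every answer from a randomly drawn
# drop-out question onwards, while letting some respondents finish.
simulate_attrition <- function(data, first_drop = 3, p_complete = 0.3) {
  n_q <- ncol(data)
  n_r <- nrow(data)
  # Drop-out question for each respondent, drawn uniformly from first_drop..n_q
  drop_at <- sample(first_drop:n_q, n_r, replace = TRUE)
  # Assumption: a share p_complete of respondents never drops out;
  # n_q + 1 lies past the last column, so nothing is blanked for them
  drop_at[runif(n_r) < p_complete] <- n_q + 1
  # mask[i, j] is TRUE when question j is at or after respondent i's drop-out point
  mask <- outer(drop_at, seq_len(n_q), "<=")
  data[mask] <- NA
  data
}

# Small toy data set so the sketch runs on its own
toy <- data.frame(
  Q1 = sample(18:90, 20, replace = TRUE),
  Q2 = sample(c("m", "f"), 20, replace = TRUE),
  Q3 = sample(c(0, 1), 20, replace = TRUE),
  Q4 = sample(c(0, 1), 20, replace = TRUE),
  Q5 = sample(c(0, 1), 20, replace = TRUE)
)
toy_attrited <- simulate_attrition(toy)
head(toy_attrited)
```

Applied to the question's data this would be `simulate_attrition(df)`. Building the mask in one step avoids modifying `df` row by row, and keeping the drop-out point per respondent in `drop_at` makes it easy to swap in a different attrition model later.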