Error in densityplot with mice: missing data example

I have the following data:

dput(example)
structure(list(q1 = c(5, 22, 16, 24, 9, 20, 21, 16, 28, 28, 24, 
25, 34, 22, 29, NA, 24, 13, 10, 17, 24, 21, 22, 35, 20, 25, 25, 
23, 22, 20, 27, 22, 20, 23, 5, 21, 19, 17, 27, 20, 35, 35, 10, 
16, 22, 34, 34, 23, 25, 23, 25, 30, 18, 21, 15, 23, 5, 35, 5, 
30), q2 = c(5, 5, 24, 15, 5, 5, 26, 23, 24, 9, 24, 5, 15, 26, 
30, 14, 14, 19, 11, 25, 20, 5, 14, 13, 11, 10, 13, 16, 16, 21, 
10, 12, 20, 9, 15, 5, 13, 5, 30, 18, 12, 27, 10, 9, 20, 5, 9, 
10, 11, 26, 22, 8, 6, 5, 15, 6, 5, 35, 10, 18), q3 = c(11, 22, 
NA, 22, 6, 18, 30, 6, 26, NA, 17, 22, 33, 19, 22, 25, 23, 13, 
13, 15, 16, 16, 23, 24, 6, 25, 27, 12, 25, 17, 28, 15, 20, 31, 
5, 17, 17, 20, 24, 7, 35, 35, 10, 10, 20, 10, 31, 21, 16, 32, 
25, 30, 10, 24, 15, 24, 5, 35, 9, 26), q4 = c(14, 15, 23, 21, 
NA, 25, 30, 23, 28, 20, 25, 5, 35, 30, 19, 23, 30, 5, 23, 18, 
30, 15, 30, 22, 8, 29, 35, 23, 23, 24, 25, 25, 20, 25, 5, 15, 
34, 8, 32, 35, 35, 35, 10, 6, 21, 10, 24, 27, 10, 30, 35, 15, 
6, 21, 15, 15, 5, 35, 19, 26), q5 = c(5, 18, 21, 19, 5, 6, 5, 
29, 20, 23, 22, 5, 16, 22, 12, 13, 18, 5, 17, 15, 18, 16, 20, 
8, 12, 19, 12, 23, 9, 16, 5, 29, 20, 5, 5, 5, 5, 5, 30, 22, 32, 
35, 10, 13, 20, 13, 12, 16, 5, 24, 22, 17, 5, 20, 14, 5, 5, 35, 
15, 16), q6 = c(15, 9, 25, 26, 6, 17, 28, 32, 26, 28, 24, 25, 
11, 24, 31, 18, 19, 6, 20, 26, 29, 17, 21, 24, 7, 29, 17, 17, 
14, 25, 24, 35, 24, 6, 16, 6, 9, 6, 38, 19, 30, 42, 12, 20, 27, 
26, 25, 13, 9, 36, 27, 27, 7, 24, 22, 6, 16, 42, 14, 11)), class = "data.frame", row.names = c(NA, 
-60L))

Then I use mice:

*Edit: I forgot the complete() line

library(mice)
imp <- mice(example,m=5,maxit=50,meth='pmm',seed=500)
example_i <- complete(imp,1)

But when I try to get the density plot, I get the following error:

 densityplot(imp)
Error in str2lang(x) : <text>:2:0: unexpected end of input
1: ~
   ^ 

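My questions are:

  1. Is there something fundamentally wrong with the way I am imputing the missing data? (This is just a small example.)
  2. Am I using the mice arguments correctly?
  3. What am I doing wrong with densityplot, given that it has worked for all the other scales I am using?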

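Answer

You need to give densityplot a formula; otherwise it plots all variables with more than 2 missing values. Since none of your variables has more than 2 missing values, and densityplot does not anticipate that case, it produces this cryptic error.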
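A quick way to check this before plotting is to count the missing values per column. The sketch below uses a small hypothetical data frame called toy as a stand-in for your example data (the column names match, the values are made up), and it only mirrors the "more than 2 missing values" part of the selection, not the full condition used inside mice.

```r
# Hypothetical stand-in for the question's data: q1 and q4 have 1 NA, q3 has 2.
toy <- data.frame(
  q1 = c(5, NA, 16, 24, 9),
  q3 = c(11, 22, NA, NA, 6),
  q4 = c(14, 15, 23, 21, NA)
)

# Number of missing values in each column.
nmis <- colSums(is.na(toy))

# Columns that would pass the "more than 2 missing values" filter.
eligible <- names(nmis)[nmis > 2]

print(nmis)
print(eligible)
```

If eligible comes back empty, calling densityplot(imp) without a formula has nothing to plot.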


Working example

example$q4[1:10] <- NA
imp <- mice(example, m = 5, maxit = 50, meth = "pmm", seed = 500)
densityplot(imp) 
# equivalent: densityplot(imp, ~ q4)


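Rationale

imp is of class mids, so you are calling densityplot.mids. Generally, densityplot.mids wants you to give it a formula (the data argument) so that it knows which variables to plot (see ?densityplot.mids). To plot q4, the code is densityplot(imp, ~ q4).

In densityplot.mids, we see: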

if (missing(data)) {
    vnames <- vnames[!allfactors & x$nmis > 2 & x$nmis < 
        nrow(x$data) - 1]
    formula <- as.formula(paste("~", paste(vnames, 
        collapse = "+", sep = ""), sep = ""))
}

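If you call traceback() right after getting the error, you will see that the last statement in the snippet above (the as.formula call) is the one that raises the error.

In the first statement you can see the condition x$nmis > 2, which means it selects every column with more than 2 missing values. When no column satisfies the condition, vnames evaluates to character(0), so the following statement produces the string ~, which is the code you see in the error message.

So why does having too few missing values lead to an error? Because densityplot draws a distribution, and drawing a distribution from only 1 or 2 points is not feasible.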
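You can reproduce that step in base R without mice. This is only a sketch of the two paste() calls from the snippet, with vnames set by hand to the empty result; the names rhs, formula_text and result are mine, and the exact wording of the error message may differ between R versions.

```r
# What vnames looks like when no column passes the filter.
vnames <- character(0)

# Same construction as in the snippet: join the names with "+", then prepend "~".
rhs <- paste(vnames, collapse = "+", sep = "")
formula_text <- paste("~", rhs, sep = "")
print(formula_text)

# Parsing a bare "~" fails, which is where the cryptic error comes from.
result <- tryCatch(
  as.formula(formula_text),
  error = function(e) conditionMessage(e)
)
print(result)
```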


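Suggestion

The package maintainers could improve the error by simply checking whether vnames has any content and, if not, throwing an informative error. If you think it would be useful, you may want to raise it as an issue on GitHub.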
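A minimal sketch of what such a check could look like is below. To be clear, build_density_formula is a hypothetical helper I made up for illustration, not a function in mice, and the message text is only a suggestion.

```r
# Hypothetical helper (not part of mice): build the default formula,
# but fail with a readable message when there is nothing to plot.
build_density_formula <- function(vnames) {
  if (length(vnames) == 0L) {
    stop(
      "No variables with more than 2 missing values to plot; ",
      "supply a formula, e.g. densityplot(imp, ~ q4).",
      call. = FALSE
    )
  }
  as.formula(paste("~", paste(vnames, collapse = "+")))
}

# With eligible variables, a one-sided formula is returned.
ok <- build_density_formula(c("q4", "q5"))
print(ok)

# With no eligible variables, the user gets an informative message.
msg <- tryCatch(
  build_density_formula(character(0)),
  error = function(e) conditionMessage(e)
)
print(msg)
```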