Error in densityplot with mice: missing data example
I have the following data:
dput(example)
structure(list(q1 = c(5, 22, 16, 24, 9, 20, 21, 16, 28, 28, 24,
25, 34, 22, 29, NA, 24, 13, 10, 17, 24, 21, 22, 35, 20, 25, 25,
23, 22, 20, 27, 22, 20, 23, 5, 21, 19, 17, 27, 20, 35, 35, 10,
16, 22, 34, 34, 23, 25, 23, 25, 30, 18, 21, 15, 23, 5, 35, 5,
30), q2 = c(5, 5, 24, 15, 5, 5, 26, 23, 24, 9, 24, 5, 15, 26,
30, 14, 14, 19, 11, 25, 20, 5, 14, 13, 11, 10, 13, 16, 16, 21,
10, 12, 20, 9, 15, 5, 13, 5, 30, 18, 12, 27, 10, 9, 20, 5, 9,
10, 11, 26, 22, 8, 6, 5, 15, 6, 5, 35, 10, 18), q3 = c(11, 22,
NA, 22, 6, 18, 30, 6, 26, NA, 17, 22, 33, 19, 22, 25, 23, 13,
13, 15, 16, 16, 23, 24, 6, 25, 27, 12, 25, 17, 28, 15, 20, 31,
5, 17, 17, 20, 24, 7, 35, 35, 10, 10, 20, 10, 31, 21, 16, 32,
25, 30, 10, 24, 15, 24, 5, 35, 9, 26), q4 = c(14, 15, 23, 21,
NA, 25, 30, 23, 28, 20, 25, 5, 35, 30, 19, 23, 30, 5, 23, 18,
30, 15, 30, 22, 8, 29, 35, 23, 23, 24, 25, 25, 20, 25, 5, 15,
34, 8, 32, 35, 35, 35, 10, 6, 21, 10, 24, 27, 10, 30, 35, 15,
6, 21, 15, 15, 5, 35, 19, 26), q5 = c(5, 18, 21, 19, 5, 6, 5,
29, 20, 23, 22, 5, 16, 22, 12, 13, 18, 5, 17, 15, 18, 16, 20,
8, 12, 19, 12, 23, 9, 16, 5, 29, 20, 5, 5, 5, 5, 5, 30, 22, 32,
35, 10, 13, 20, 13, 12, 16, 5, 24, 22, 17, 5, 20, 14, 5, 5, 35,
15, 16), q6 = c(15, 9, 25, 26, 6, 17, 28, 32, 26, 28, 24, 25,
11, 24, 31, 18, 19, 6, 20, 26, 29, 17, 21, 24, 7, 29, 17, 17,
14, 25, 24, 35, 24, 6, 16, 6, 9, 6, 38, 19, 30, 42, 12, 20, 27,
26, 25, 13, 9, 36, 27, 27, 7, 24, 22, 6, 16, 42, 14, 11)), class = "data.frame", row.names = c(NA,
-60L))
Then I use mice:
*Edit: forgot the complete line
library(mice)
imp <- mice(example,m=5,maxit=50,meth='pmm',seed=500)
example_i <- complete(imp,1)
But when I try to get the density plot, I get the following error:
densityplot(imp)
Error in str2lang(x) : <text>:2:0: unexpected end of input
1: ~
^
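My questions are:
- Is there anything fundamentally wrong with how I am imputing the missing data? (This is just a small example.)
- Am I using the mice arguments correctly?
- What am I doing wrong with the density plot, given that it works for all the other scales I am working with?
Answer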
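You need to give densityplot a formula; otherwise it plots every variable with more than 2 missing values. None of your variables has more than 2 missing values, and densityplot does not anticipate that case, so it produces this cryptic error.
Working example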
example$q4[1:10] <- NA
imp <- mice(example, m = 5, maxit = 50, meth = "pmm", seed = 500)
densityplot(imp)
# equivalent: densityplot(imp, ~ q4)
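Rationale
imp is of class mids, so you are calling densityplot.mids. Typically, densityplot.mids expects you to provide a formula (the data argument) so that it knows which variables to plot (see ?densityplot.mids). To plot q4, the code is densityplot(imp, ~ q4).
In densityplot.mids, we see: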
if (missing(data)) {
  vnames <- vnames[!allfactors & x$nmis > 2 & x$nmis < nrow(x$data) - 1]
  formula <- as.formula(paste("~", paste(vnames, collapse = "+", sep = ""), sep = ""))
}
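If you call traceback() right after getting the error, you will see that the as.formula() call in the excerpt above is the one that raises it.
The vnames assignment contains the condition x$nmis > 2, which keeps only the columns with more than 2 missing values. When no column satisfies it, vnames evaluates to character(0), so the as.formula() call is handed the string ~, which is the code you see in the error message.
So why do too few missing values cause a problem? Because densityplot draws a distribution, and drawing the distribution of 1 or 2 points is not feasible.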
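The failing step can be reproduced in base R without mice. The sketch below only mimics the paste() and as.formula() construction from the excerpt; the names formula_text and result are illustrative and do not appear in the package.

```r
# No variable passed the filter, so the vector of names is empty.
vnames <- character(0)

# Same paste() construction as in the excerpt from densityplot.mids.
formula_text <- paste("~", paste(vnames, collapse = "+", sep = ""), sep = "")
print(formula_text)  # "~"

# as.formula() has nothing to parse after the tilde, so it fails.
result <- tryCatch(as.formula(formula_text), error = function(e) e)
print(inherits(result, "error"))
```

The string handed to as.formula() is a bare tilde, which matches the 1: ~ shown in the error message.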
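Suggestion
The package maintainers could improve the error by simply checking whether vnames contains anything and, if it does not, throwing an informative error. If you think that would be useful, you may want to raise it as an issue on GitHub.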
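Until such a check exists, a similar guard can be run on the user side before calling densityplot. This is a minimal sketch: plottable_vars is a hypothetical helper, not part of mice, and it only mirrors the filter quoted from densityplot.mids. The counts are hard-coded from the question's data so that the snippet runs without the package; with a real mids object you would presumably pass imp$nmis and nrow(imp$data), the same fields the excerpt reads.

```r
# Hypothetical helper (not part of mice): mirrors the filter quoted above.
plottable_vars <- function(nmis, n_rows, is_factor = rep(FALSE, length(nmis))) {
  names(nmis)[!is_factor & nmis > 2 & nmis < n_rows - 1]
}

# Missing-value counts per column in the question's data (60 rows).
nmis_original <- c(q1 = 1, q2 = 0, q3 = 2, q4 = 1, q5 = 0, q6 = 0)
vars <- plottable_vars(nmis_original, n_rows = 60)
if (length(vars) == 0) {
  message("No variable has more than 2 missing values; pass a formula explicitly.")
}

# After example$q4[1:10] <- NA, q4 has 10 missing values and passes the filter.
nmis_modified <- replace(nmis_original, "q4", 10)
print(plottable_vars(nmis_modified, n_rows = 60))
```

Checking the length of the result first turns the cryptic parse error into a message that says what is actually wrong.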