How do I exclude missing data from my predictor variable in an aov() analysis?
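I am trying to perform a one-way ANOVA on a data set with a categorical predictor variable (GE.CATIE) that has four levels (HD, HE, EP, ET), with ctng as the response, and then analyze it with a TukeyHSD test. However, my predictor variable has a number of missing values that I want to exclude from the analysis. These are read in as an additional level called "". This is what my code looks like: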
> GEaov<-aov(ctng~allv$GE.CATIE)
> TukeyHSD(GEaov)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = ctng ~ allv$GE.CATIE)

$`allv$GE.CATIE`
             diff          lwr          upr     p adj
EP-    0.04003815 -0.147479895  0.227556198 0.9775550
ET-   -0.06458370 -0.400163176  0.270995782 0.9847460
HD-    0.12445374 -0.004557746  0.253465218 0.0647330
HE-   -0.17725081 -0.350691202 -0.003810417 0.0423469
ET-EP -0.10462185 -0.461182978  0.251939281 0.9301554
HD-EP  0.08441558 -0.092123972  0.260955141 0.6873773
HE-EP -0.21728896 -0.428485131 -0.006092791 0.0401655
HD-ET  0.18903743 -0.140533172  0.518608038 0.5190113
HE-ET -0.11266711 -0.462029948  0.236695724 0.9039447
HE-HD -0.30170455 -0.463212338 -0.140196753 0.0000038
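I tried changing the blank values in GE.CATIE to "NA", but then it did the same thing, only now it counted "NA" as a level of the predictor. Adding na.action=na.omit did not change anything.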
# create some data (stringsAsFactors = TRUE so that var2 is a factor,
# which is no longer the default since R 4.0.0)
> xy <- data.frame(var1 = 1:3, var2 = c("a", "b", ""), stringsAsFactors = TRUE)
# find rows that have `""` in `var2`
> xy$var2 == ""
[1] FALSE FALSE  TRUE
# subset these rows from the data.frame's variable `var2`
> xy[xy$var2 == "", "var2"]
[1] 
Levels:  a b
# change `""` to `NA` (the missing value, not the string `"NA"`)
> xy[xy$var2 == "", "var2"] <- NA
# level `""` is now "orphaned". drop it using `droplevels()` and keep the result
# (check with `levels(xy$var2)`)
> xy <- droplevels(xy)
> xy
  var1 var2
1    1    a
2    2    b
3    3 <NA>
Rows whose predictor is NA are then dropped automatically by aov(), because the default na.action is na.omit.
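
Applied to the data from the question, the same recoding might look like the sketch below. The names allv, ctng and GE.CATIE come from the question, but the values are simulated here purely for illustration, and GE.CATIE is assumed to be a factor in which missing entries appear as the level "".

```r
# Hypothetical stand-in for the asker's data: the names come from the
# question, but every value below is simulated.
set.seed(1)
allv <- data.frame(
  ctng     = rnorm(50),
  GE.CATIE = factor(rep(c("HD", "HE", "EP", "ET", ""), times = 10))
)

# Recode the empty-string level as a real missing value (NA, not "NA"),
# then drop the now-unused "" level.
allv$GE.CATIE[allv$GE.CATIE == ""] <- NA
allv$GE.CATIE <- droplevels(allv$GE.CATIE)

# Passing `data = allv` keeps the formula (and the Tukey labels) free of
# `allv$` prefixes; rows with NA in GE.CATIE are omitted by default.
GEaov <- aov(ctng ~ GE.CATIE, data = allv)
tukey <- TukeyHSD(GEaov)

levels(allv$GE.CATIE)     # only the four real groups remain
rownames(tukey$GE.CATIE)  # pairwise comparisons among those four groups
```

With the "" level gone, TukeyHSD() should report only the comparisons among HD, HE, EP and ET, so rows such as "EP-" no longer appear.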