Performing a t-test in R with categorical variables
Hi all, I'm trying to run a t-test, but something seems to be going wrong...
The data look like this:
pot pair type height
I 1 Cross 23,5
I 1 Self 17,375
I 2 Cross 12
I 2 Self 20,375
The t-test I ran is:
darwin <- read.table("darwin.txt", header=T)
plot(darwin$type, darwin$height, ylab="Height")
darwin.no.outlier = subset(darwin, height>13)
tapply(darwin.no.outlier$height, darwin.no.outlier$type, var)
t.test(darwin$height ~ darwin$type)
R gives me the following error:
Error in if (stderr < 10 * .Machine$double.eps * max(abs(mx), abs(my))) stop("data are essentially constant") :
  missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In mean.default(x) : argument is not numeric or logical: returning NA
2: In var(x) :
  Calling var(x) on a factor x is deprecated and will become an error.
  Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
3: In mean.default(y) : argument is not numeric or logical: returning NA
4: In var(y) :
  Calling var(x) on a factor x is deprecated and will become an error.
  Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
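The problem is your decimal separator: in your height column, the decimal mark is a comma rather than a point. Because of the comma, read.table cannot parse the column as numeric and converts it to a factor (a character vector in R 4.0 and later), which is why t.test fails.
When importing the data, pass dec = "," to read.table (dec is the character used in the file for decimal points). Using your data as an example: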
darwin <- read.table(text = "pot pair type height
I 1 Cross 23,5
I 1 Self 17,375
I 2 Cross 12
I 2 Self 20,375", header = TRUE, dec = ",")
The output of
t.test(darwin$height ~ darwin$type)
is then:
Welch Two Sample t-test
data: darwin$height by darwin$type
t = -0.18932, df = 1.1355, p-value = 0.878
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-58.34187 56.09187
sample estimates:
mean in group Cross mean in group Self
17.750 18.875
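
If the data frame has already been loaded, the diagnosis and both ways of fixing it can be sketched roughly as follows. This is only an illustration under a few assumptions: the object names raw, bad, good and fixed are made up for the example, and the four rows are the sample from the question rather than the full darwin.txt file.

```r
# The same four rows as in the question, with commas as decimal marks.
raw <- "pot pair type height
I 1 Cross 23,5
I 1 Self 17,375
I 2 Cross 12
I 2 Self 20,375"

# Without dec = ",", height is not parsed as a number
# (it becomes a factor in R < 4.0 and a character vector in R >= 4.0).
bad <- read.table(text = raw, header = TRUE)
is.numeric(bad$height)   # FALSE

# Option 1: re-read the data with the correct decimal mark.
good <- read.table(text = raw, header = TRUE, dec = ",")
is.numeric(good$height)  # TRUE

# Option 2: repair the column in place. as.character() comes first so that
# a factor is converted through its labels, not its integer codes.
fixed <- bad
fixed$height <- as.numeric(sub(",", ".", as.character(fixed$height), fixed = TRUE))

# Both routes should yield the same numeric values.
all.equal(good$height, fixed$height)

# The formula interface with data = avoids repeating the data frame name.
res <- t.test(height ~ type, data = good)
res
```

Checking is.numeric() (or str()) on the column right after import is a cheap way to catch this kind of problem before it surfaces as a confusing error inside t.test.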