How to run a t-test on a subset of data
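First, is this dataset in a suitable form for a t-test?
https://i.stack.imgur.com/tMK6R.png
Second, I am trying to run a two-sample t-test comparing the means of 'outcome 1' for treatments a and b at time 3. How do I do that?
Sample data: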
df <- structure(list(code = c(100, 100, 100, 101, 101, 101, 102, 102,
102, 103, 103, 103), treatment = c("a", "a", "a", "b", "b", "b",
"a", "a", "a", "b", "b", "b"), sex = c("f", "f", "f", "m", "m",
"m", "f", "f", "f", "f", "f", "f"), time = c(1, 2, 3, 1, 2, 3,
1, 2, 3, 1, 2, 3), `outcome 1` = c(21, 23, 33, 44, 45, 47, 22,
34, 22, 55, 45, 56), `outcome 2` = c(21, 32, 33, 33, 44, 45,
22, 57, 98, 65, 42, 42), `outcome 3` = c(62, 84, 63, 51, 45,
74, 85, 34, 96, 86, 45, 47)), .Names = c("code", "treatment",
"sex", "time", "outcome 1", "outcome 2", "outcome 3"),
class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -12L))
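First define the subsets you want to test, then run the t-test on them. You don't have to store the subsets in variables as I did, but doing so makes the t-test output easier to read.
For t-test questions I would normally point to the help page at ?t.test, but since this one involves somewhat more complicated subsetting, here is how to do it: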
var_a <- df$`outcome 1`[df$treatment=="a" & df$time==3]
var_b <- df$`outcome 1`[df$treatment=="b" & df$time==3]
t.test(var_a,var_b)
Output:
        Welch Two Sample t-test

data:  var_a and var_b
t = -3.3773, df = 1.9245, p-value = 0.08182
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -55.754265   7.754265
sample estimates:
mean of x mean of y
     27.5      51.5
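As an alternative sketch, the same comparison can be written with the formula interface of t.test, which takes a `subset` argument, so the two vectors don't need to be extracted by hand. The data frame below is rebuilt as a plain data.frame containing only the columns needed here (an assumption made to keep the example self-contained), and `res` is just an illustrative name:

```r
# Rebuild the relevant columns of the example data.
# check.names = FALSE keeps the space in the column name "outcome 1".
df <- data.frame(
  code = rep(100:103, each = 3),
  treatment = rep(c("a", "b", "a", "b"), each = 3),
  time = rep(1:3, times = 4),
  `outcome 1` = c(21, 23, 33, 44, 45, 47, 22, 34, 22, 55, 45, 56),
  check.names = FALSE
)

# Formula interface: outcome on the left, grouping variable on the right.
# `subset` restricts the test to the rows where time == 3.
# Backticks are needed because the column name contains a space.
res <- t.test(`outcome 1` ~ treatment, data = df, subset = time == 3)
print(res)
```

This should give the same Welch test as the two-vector call above; the sample estimates are simply labelled by group ("mean in group a", "mean in group b") rather than as x and y.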