具有多个组和多个特征的未配对 t 检验的 R 代码
R code for unpaired t-test with multiple groups & multiple traits
假设我有以下数据:
Class,a,b,c,d,e,f,g,h,i,j,k,l,m
A,0.1,0.4,0.4,0.7,0.1,0.1,0.9,0.5,0.5,0.9,0.6,0.1,0.9
A,0.0,0.3,0.7,0.1,0.2,0.1,0.3,0.2,1.0,0.9,0.6,0.6,0.1
B,0.6,0.3,0.4,0.6,0.3,0.8,0.3,0.0,0.8,0.9,0.6,0.7,0.5
B,0.8,0.3,0.8,0.8,0.3,0.7,0.3,0.7,0.7,0.4,0.9,0.3,0.8
C,0.8,0.1,0.0,0.0,0.2,0.0,0.7,0.3,0.1,0.6,0.8,0.0,0.3
C,0.3,0.0,0.7,1.0,0.8,0.3,0.6,0.4,0.9,0.9,0.1,0.9,1.0
D,0.9,0.7,1.0,0.1,0.2,0.5,0.9,0.9,0.6,0.8,0.6,0.4,0.2
D,0.6,0.2,0.8,0.8,0.1,0.3,0.0,0.0,0.6,0.3,0.6,0.9,0.3
E,1.0,0.5,0.5,0.0,0.7,0.9,1.0,0.1,0.5,0.5,0.1,0.9,0.6
E,0.5,0.8,0.5,0.1,0.7,0.6,0.1,0.8,0.2,0.1,0.7,0.6,0.4
F,0.6,0.4,0.6,0.2,0.5,0.4,0.2,0.2,0.3,0.4,0.9,0.5,0.7
F,0.9,0.6,0.8,0.6,0.6,0.3,0.8,0.5,0.4,0.4,0.5,0.7,0.4
G,0.2,0.1,0.6,0.4,0.3,0.8,0.4,0.1,0.9,0.7,0.0,0.5,0.8
G,0.2,0.3,0.2,0.8,0.3,0.6,0.2,0.6,0.9,0.1,0.2,0.1,0.3
H,0.7,0.4,0.7,0.2,0.5,0.5,0.5,0.9,0.8,0.4,0.9,0.7,0.5
H,0.1,0.6,0.5,0.7,0.5,0.4,0.5,0.4,0.1,0.6,0.1,0.3,0.5
I,0.6,0.8,0.7,0.6,0.3,0.5,0.2,0.6,0.5,0.3,0.9,0.8,0.5
I,0.1,0.0,0.5,0.2,0.3,0.2,0.4,0.9,0.9,0.8,0.5,0.3,0.2
J,0.6,0.1,0.2,0.3,0.3,0.6,0.5,0.9,0.0,0.2,0.4,0.8,0.9
J,0.3,0.7,0.4,0.1,0.4,0.1,0.7,0.5,0.1,0.0,0.1,0.2,0.6
K,0.1,0.9,0.5,0.0,0.0,0.3,0.0,0.7,0.2,0.0,0.0,0.6,0.6
K,0.9,0.6,0.1,0.9,0.3,0.2,0.1,0.2,0.2,0.1,0.7,0.8,0.0
L,1.0,0.6,0.0,0.0,0.8,0.5,0.7,0.9,0.3,0.5,0.4,0.9,0.4
L,0.6,0.2,0.5,0.2,0.7,0.6,0.4,0.5,0.8,0.6,0.8,0.3,0.9
M,0.7,0.1,0.1,0.8,0.4,1.0,0.9,0.1,0.1,0.5,0.5,0.6,0.3
M,0.8,0.4,0.2,0.4,0.2,0.2,0.3,0.5,0.5,0.3,0.3,1.0,0.3
N,0.0,0.4,0.5,0.1,0.6,0.5,0.3,0.4,0.8,0.7,0.1,0.2,0.8
N,0.9,0.4,1.0,0.9,0.7,0.5,0.7,0.4,0.8,0.8,0.1,0.7,0.2
O,0.2,0.1,1.0,0.2,0.0,0.8,0.4,0.9,0.1,1.0,0.8,0.3,0.5
O,0.6,1.0,1.0,0.8,1.0,0.6,0.4,0.3,0.2,0.8,0.5,0.0,0.1
P,0.2,0.1,0.7,0.8,0.3,0.2,0.4,0.4,1.0,0.2,0.7,0.1,0.0
P,0.3,0.6,0.6,0.5,0.9,0.1,0.9,0.4,0.5,0.2,0.7,0.8,0.2
Q,0.9,0.7,0.6,0.7,0.4,0.3,0.3,0.8,0.6,0.6,1.0,0.1,0.7
Q,0.6,0.9,0.9,0.1,0.8,0.7,0.0,0.1,0.7,0.0,0.9,0.7,0.8
R,0.1,0.7,0.5,0.9,0.9,0.1,1.0,0.1,0.6,0.2,0.5,0.5,0.3
R,0.1,0.6,0.9,0.6,0.6,0.8,1.0,0.3,0.7,0.9,0.1,0.8,0.2
structure(list(Class = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L,
4L, 5L, 5L, 6L, 6L, 7L, 7L, 8L, 8L, 9L, 9L, 10L, 10L, 11L, 11L,
12L, 12L, 13L, 13L, 14L, 14L, 15L, 15L, 16L, 16L, 17L, 17L, 18L,
18L), .Label = c("A", "B", "C", "D", "E", "F", "G", "H", "I",
"J", "K", "L", "M", "N", "O", "P", "Q", "R"), class = "factor"),
a = c(0.7, 0.4, 0.9, 0.5, 0.4, 0.5, 0, 0.3, 0.7, 0.7, 0.6,
0.5, 0.4, 0.1, 0.3, 0.1, 0.3, 0.4, 0.5, 0.8, 0.8, 0.6, 0.2,
0.3, 0.2, 0.6, 0.6, 0.4, 0.9, 0.1, 0.6, 0.6, 0.7, 0.7, 1,
0.2), b = c(0.3, 0.1, 0.3, 1, 0.2, 0.8, 0.9, 0.5, 0.6, 0.3,
0.2, 0.9, 0.9, 0.8, 0, 0.2, 0.8, 0.7, 0.8, 0.8, 0.5, 0, 0.3,
0.9, 0.4, 0.9, 0.1, 0.4, 0.5, 0.7, 0.8, 0.9, 0.1, 0.8, 1,
0.7), c = c(0.2, 0.6, 0.7, 0.5, 0.4, 0.7, 0.9, 0.1, 0.2,
0.8, 0.8, 0.9, 0.2, 0.6, 0.4, 0.3, 0.1, 0.9, 0, 0.8, 0.5,
0.9, 0.4, 0.9, 0.8, 0.4, 0.9, 0.7, 0.9, 1, 0.2, 0.1, 1, 0.9,
0.8, 0.8), d = c(0.8, 0.1, 0.5, 0.8, 0.3, 0.2, 0.9, 0.9,
0, 0.3, 0.7, 0, 0.1, 0, 0, 0.8, 1, 0.8, 0.6, 1, 0.6, 0.6,
0.2, 0.1, 1, 0.1, 0.7, 0.3, 0.7, 0.3, 0.9, 0.8, 0.1, 0.2,
0.1, 0.4), e = c(0.6, 0.6, 0.3, 0.6, 0.7, 0.6, 0.9, 0.8,
0.6, 0.4, 0.6, 0.7, 0.7, 0.2, 1, 1, 1, 0.5, 0.4, 0.5, 0.6,
0.7, 0.1, 0.1, 0.6, 0.2, 0.4, 1, 0.1, 0.3, 0.7, 0, 0.4, 0.2,
0.7, 0.1), f = c(0.9, 0.2, 0.4, 0.8, 0, 0.2, 1, 0.2, 0.1,
0.1, 0.4, 1, 0.1, 0.9, 0.2, 0.2, 0.3, 0.6, 0.9, 0.2, 0.5,
0.4, 0.4, 0.2, 0.6, 0.1, 0.6, 0.1, 0.5, 0.1, 0.6, 0.9, 0.5,
0.4, 0.5, 0.8), g = c(0.6, 0.3, 0.9, 0.1, 0.9, 0.3, 0.1,
0.2, 0, 0.4, 1, 0.3, 0.7, 0.8, 0.5, 0.5, 0.6, 0.2, 0.2, 0,
1, 0.6, 0.2, 0.7, 0.6, 0.6, 0.1, 0.4, 0.6, 0.2, 0.6, 0.4,
0.6, 0.7, 0.3, 1), h = c(0.1, 0.8, 0.3, 0.3, 0.3, 0.8, 0.3,
0.7, 0.4, 0.8, 0.3, 0.1, 0.5, 0.7, 0.1, 0.3, 0.7, 0.4, 0.7,
0.5, 0.4, 0.1, 0.9, 0.8, 0.9, 0.7, 0.3, 0.1, 0.3, 0.1, 0.2,
0.1, 0.9, 0.3, 0.1, 0.6), i = c(0.8, 0.8, 0.4, 0.9, 0.5,
0.8, 0.6, 0.6, 0.2, 0.7, 0.1, 0.8, 0.1, 0.5, 0.7, 0.6, 0.8,
0.4, 0.2, 0.9, 0.7, 0.8, 0.5, 0.8, 0.3, 0.3, 1, 0.6, 0.9,
0.9, 0.7, 0.5, 0.1, 1, 0.5, 0.1), j = c(0.5, 0, 0.4, 0.7,
0.5, 0.6, 0.3, 0.9, 0.1, 0.4, 1, 0.5, 0.7, 0.4, 0.3, 0.9,
0, 0.5, 0.7, 0.4, 0, 0.1, 0.4, 0.3, 0.6, 0.9, 0.5, 0.5, 0.9,
0.5, 0.5, 0.3, 0.1, 0.1, 0.9, 0.7), k = c(0.8, 0.1, 0, 0.5,
0.7, 0.9, 0.4, 0.9, 0.3, 0.8, 0.7, 0.8, 0.2, 0.2, 0.8, 0.4,
0.8, 0.7, 0.2, 0.6, 0.5, 0.4, 1, 0.2, 0.7, 0.3, 0.1, 0, 0.4,
0.7, 0.8, 0.3, 0.3, 0.8, 0.5, 0.9), l = c(0.8, 0.6, 0.3,
0.6, 0.6, 0.1, 0.2, 0.3, 0.7, 0.6, 0.1, 0.8, 1, 1, 0.5, 0.3,
0.3, 0, 0.4, 0.1, 0.5, 0.6, 0.9, 0.4, 0.6, 0.4, 0.3, 0.4,
0.5, 0.5, 0.5, 0.5, 0.9, 0.2, 0.5, 0.9), m = c(0.4, 0, 0.7,
0.1, 0.9, 0.8, 0.6, 0.1, 0.2, 0.2, 0.9, 0.9, 0.4, 0.4, 0.9,
0.6, 0.2, 0.1, 0.7, 0.5, 0.2, 0.8, 0.4, 0.6, 0.9, 0.7, 0,
0.8, 0.7, 0.2, 0.1, 0.8, 0.2, 0.2, 0.3, 0.2)), row.names = c(NA,
-36L), class = "data.frame")
我想 运行 在列(即 A-B、A-C、B-C 等)内的行(即 A-B、A-C、B-C 等)之间进行非配对 t 检验 "a",然后是 "b" 内的 A-B、A-C、B-C 等)。有没有一种简单的方法可以用 R 来做到这一点,而不是 运行 单独地每个基因?理想情况下,我想要每个基因中每个样本配对的 p 值输出列表。
非常感谢,
L.
我们可以使用 combn
得到 'Class' 列值的成对组合并在 t.test
中使用
lstOut <- combn(unique(df1$Class), 2, FUN = function(x) {
dat1 <- subset(df1, Class == x[1])
dat2 <- subset(df1, Class == x[2])
Map(function(x, y) tryCatch(t.test(x, y),
error = function(e) NA), dat1[-1], dat2[-1])
}, simplify = FALSE)
names(lstOut) <- combn(as.character(unique(df1$Class)), 2, FUN = paste, collapse="_")
list
输出可以 tidy
并转换为单个 data.frame
library(broom)
library(purrr)
out <- map_depth(lstOut, .depth = 2, tidy)
out2 <- map_dfr(out, ~bind_rows(.x, .id = 'colname'), .id = 'classCompare')
假设我有以下数据:
Class,a,b,c,d,e,f,g,h,i,j,k,l,m A,0.1,0.4,0.4,0.7,0.1,0.1,0.9,0.5,0.5,0.9,0.6,0.1,0.9 A,0.0,0.3,0.7,0.1,0.2,0.1,0.3,0.2,1.0,0.9,0.6,0.6,0.1 B,0.6,0.3,0.4,0.6,0.3,0.8,0.3,0.0,0.8,0.9,0.6,0.7,0.5 B,0.8,0.3,0.8,0.8,0.3,0.7,0.3,0.7,0.7,0.4,0.9,0.3,0.8 C,0.8,0.1,0.0,0.0,0.2,0.0,0.7,0.3,0.1,0.6,0.8,0.0,0.3 C,0.3,0.0,0.7,1.0,0.8,0.3,0.6,0.4,0.9,0.9,0.1,0.9,1.0 D,0.9,0.7,1.0,0.1,0.2,0.5,0.9,0.9,0.6,0.8,0.6,0.4,0.2 D,0.6,0.2,0.8,0.8,0.1,0.3,0.0,0.0,0.6,0.3,0.6,0.9,0.3 E,1.0,0.5,0.5,0.0,0.7,0.9,1.0,0.1,0.5,0.5,0.1,0.9,0.6 E,0.5,0.8,0.5,0.1,0.7,0.6,0.1,0.8,0.2,0.1,0.7,0.6,0.4 F,0.6,0.4,0.6,0.2,0.5,0.4,0.2,0.2,0.3,0.4,0.9,0.5,0.7 F,0.9,0.6,0.8,0.6,0.6,0.3,0.8,0.5,0.4,0.4,0.5,0.7,0.4 G,0.2,0.1,0.6,0.4,0.3,0.8,0.4,0.1,0.9,0.7,0.0,0.5,0.8 G,0.2,0.3,0.2,0.8,0.3,0.6,0.2,0.6,0.9,0.1,0.2,0.1,0.3 H,0.7,0.4,0.7,0.2,0.5,0.5,0.5,0.9,0.8,0.4,0.9,0.7,0.5 H,0.1,0.6,0.5,0.7,0.5,0.4,0.5,0.4,0.1,0.6,0.1,0.3,0.5 I,0.6,0.8,0.7,0.6,0.3,0.5,0.2,0.6,0.5,0.3,0.9,0.8,0.5 I,0.1,0.0,0.5,0.2,0.3,0.2,0.4,0.9,0.9,0.8,0.5,0.3,0.2 J,0.6,0.1,0.2,0.3,0.3,0.6,0.5,0.9,0.0,0.2,0.4,0.8,0.9 J,0.3,0.7,0.4,0.1,0.4,0.1,0.7,0.5,0.1,0.0,0.1,0.2,0.6 K,0.1,0.9,0.5,0.0,0.0,0.3,0.0,0.7,0.2,0.0,0.0,0.6,0.6 K,0.9,0.6,0.1,0.9,0.3,0.2,0.1,0.2,0.2,0.1,0.7,0.8,0.0 L,1.0,0.6,0.0,0.0,0.8,0.5,0.7,0.9,0.3,0.5,0.4,0.9,0.4 L,0.6,0.2,0.5,0.2,0.7,0.6,0.4,0.5,0.8,0.6,0.8,0.3,0.9 M,0.7,0.1,0.1,0.8,0.4,1.0,0.9,0.1,0.1,0.5,0.5,0.6,0.3 M,0.8,0.4,0.2,0.4,0.2,0.2,0.3,0.5,0.5,0.3,0.3,1.0,0.3 N,0.0,0.4,0.5,0.1,0.6,0.5,0.3,0.4,0.8,0.7,0.1,0.2,0.8 N,0.9,0.4,1.0,0.9,0.7,0.5,0.7,0.4,0.8,0.8,0.1,0.7,0.2 O,0.2,0.1,1.0,0.2,0.0,0.8,0.4,0.9,0.1,1.0,0.8,0.3,0.5 O,0.6,1.0,1.0,0.8,1.0,0.6,0.4,0.3,0.2,0.8,0.5,0.0,0.1 P,0.2,0.1,0.7,0.8,0.3,0.2,0.4,0.4,1.0,0.2,0.7,0.1,0.0 P,0.3,0.6,0.6,0.5,0.9,0.1,0.9,0.4,0.5,0.2,0.7,0.8,0.2 Q,0.9,0.7,0.6,0.7,0.4,0.3,0.3,0.8,0.6,0.6,1.0,0.1,0.7 Q,0.6,0.9,0.9,0.1,0.8,0.7,0.0,0.1,0.7,0.0,0.9,0.7,0.8 R,0.1,0.7,0.5,0.9,0.9,0.1,1.0,0.1,0.6,0.2,0.5,0.5,0.3 R,0.1,0.6,0.9,0.6,0.6,0.8,1.0,0.3,0.7,0.9,0.1,0.8,0.2
structure(list(Class = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L,
4L, 5L, 5L, 6L, 6L, 7L, 7L, 8L, 8L, 9L, 9L, 10L, 10L, 11L, 11L,
12L, 12L, 13L, 13L, 14L, 14L, 15L, 15L, 16L, 16L, 17L, 17L, 18L,
18L), .Label = c("A", "B", "C", "D", "E", "F", "G", "H", "I",
"J", "K", "L", "M", "N", "O", "P", "Q", "R"), class = "factor"),
a = c(0.7, 0.4, 0.9, 0.5, 0.4, 0.5, 0, 0.3, 0.7, 0.7, 0.6,
0.5, 0.4, 0.1, 0.3, 0.1, 0.3, 0.4, 0.5, 0.8, 0.8, 0.6, 0.2,
0.3, 0.2, 0.6, 0.6, 0.4, 0.9, 0.1, 0.6, 0.6, 0.7, 0.7, 1,
0.2), b = c(0.3, 0.1, 0.3, 1, 0.2, 0.8, 0.9, 0.5, 0.6, 0.3,
0.2, 0.9, 0.9, 0.8, 0, 0.2, 0.8, 0.7, 0.8, 0.8, 0.5, 0, 0.3,
0.9, 0.4, 0.9, 0.1, 0.4, 0.5, 0.7, 0.8, 0.9, 0.1, 0.8, 1,
0.7), c = c(0.2, 0.6, 0.7, 0.5, 0.4, 0.7, 0.9, 0.1, 0.2,
0.8, 0.8, 0.9, 0.2, 0.6, 0.4, 0.3, 0.1, 0.9, 0, 0.8, 0.5,
0.9, 0.4, 0.9, 0.8, 0.4, 0.9, 0.7, 0.9, 1, 0.2, 0.1, 1, 0.9,
0.8, 0.8), d = c(0.8, 0.1, 0.5, 0.8, 0.3, 0.2, 0.9, 0.9,
0, 0.3, 0.7, 0, 0.1, 0, 0, 0.8, 1, 0.8, 0.6, 1, 0.6, 0.6,
0.2, 0.1, 1, 0.1, 0.7, 0.3, 0.7, 0.3, 0.9, 0.8, 0.1, 0.2,
0.1, 0.4), e = c(0.6, 0.6, 0.3, 0.6, 0.7, 0.6, 0.9, 0.8,
0.6, 0.4, 0.6, 0.7, 0.7, 0.2, 1, 1, 1, 0.5, 0.4, 0.5, 0.6,
0.7, 0.1, 0.1, 0.6, 0.2, 0.4, 1, 0.1, 0.3, 0.7, 0, 0.4, 0.2,
0.7, 0.1), f = c(0.9, 0.2, 0.4, 0.8, 0, 0.2, 1, 0.2, 0.1,
0.1, 0.4, 1, 0.1, 0.9, 0.2, 0.2, 0.3, 0.6, 0.9, 0.2, 0.5,
0.4, 0.4, 0.2, 0.6, 0.1, 0.6, 0.1, 0.5, 0.1, 0.6, 0.9, 0.5,
0.4, 0.5, 0.8), g = c(0.6, 0.3, 0.9, 0.1, 0.9, 0.3, 0.1,
0.2, 0, 0.4, 1, 0.3, 0.7, 0.8, 0.5, 0.5, 0.6, 0.2, 0.2, 0,
1, 0.6, 0.2, 0.7, 0.6, 0.6, 0.1, 0.4, 0.6, 0.2, 0.6, 0.4,
0.6, 0.7, 0.3, 1), h = c(0.1, 0.8, 0.3, 0.3, 0.3, 0.8, 0.3,
0.7, 0.4, 0.8, 0.3, 0.1, 0.5, 0.7, 0.1, 0.3, 0.7, 0.4, 0.7,
0.5, 0.4, 0.1, 0.9, 0.8, 0.9, 0.7, 0.3, 0.1, 0.3, 0.1, 0.2,
0.1, 0.9, 0.3, 0.1, 0.6), i = c(0.8, 0.8, 0.4, 0.9, 0.5,
0.8, 0.6, 0.6, 0.2, 0.7, 0.1, 0.8, 0.1, 0.5, 0.7, 0.6, 0.8,
0.4, 0.2, 0.9, 0.7, 0.8, 0.5, 0.8, 0.3, 0.3, 1, 0.6, 0.9,
0.9, 0.7, 0.5, 0.1, 1, 0.5, 0.1), j = c(0.5, 0, 0.4, 0.7,
0.5, 0.6, 0.3, 0.9, 0.1, 0.4, 1, 0.5, 0.7, 0.4, 0.3, 0.9,
0, 0.5, 0.7, 0.4, 0, 0.1, 0.4, 0.3, 0.6, 0.9, 0.5, 0.5, 0.9,
0.5, 0.5, 0.3, 0.1, 0.1, 0.9, 0.7), k = c(0.8, 0.1, 0, 0.5,
0.7, 0.9, 0.4, 0.9, 0.3, 0.8, 0.7, 0.8, 0.2, 0.2, 0.8, 0.4,
0.8, 0.7, 0.2, 0.6, 0.5, 0.4, 1, 0.2, 0.7, 0.3, 0.1, 0, 0.4,
0.7, 0.8, 0.3, 0.3, 0.8, 0.5, 0.9), l = c(0.8, 0.6, 0.3,
0.6, 0.6, 0.1, 0.2, 0.3, 0.7, 0.6, 0.1, 0.8, 1, 1, 0.5, 0.3,
0.3, 0, 0.4, 0.1, 0.5, 0.6, 0.9, 0.4, 0.6, 0.4, 0.3, 0.4,
0.5, 0.5, 0.5, 0.5, 0.9, 0.2, 0.5, 0.9), m = c(0.4, 0, 0.7,
0.1, 0.9, 0.8, 0.6, 0.1, 0.2, 0.2, 0.9, 0.9, 0.4, 0.4, 0.9,
0.6, 0.2, 0.1, 0.7, 0.5, 0.2, 0.8, 0.4, 0.6, 0.9, 0.7, 0,
0.8, 0.7, 0.2, 0.1, 0.8, 0.2, 0.2, 0.3, 0.2)), row.names = c(NA,
-36L), class = "data.frame")
我想 运行 在列(即 A-B、A-C、B-C 等)内的行(即 A-B、A-C、B-C 等)之间进行非配对 t 检验 "a",然后是 "b" 内的 A-B、A-C、B-C 等)。有没有一种简单的方法可以用 R 来做到这一点,而不是 运行 单独地每个基因?理想情况下,我想要每个基因中每个样本配对的 p 值输出列表。
非常感谢, L.
我们可以使用 combn
得到 'Class' 列值的成对组合并在 t.test
lstOut <- combn(unique(df1$Class), 2, FUN = function(x) {
dat1 <- subset(df1, Class == x[1])
dat2 <- subset(df1, Class == x[2])
Map(function(x, y) tryCatch(t.test(x, y),
error = function(e) NA), dat1[-1], dat2[-1])
}, simplify = FALSE)
names(lstOut) <- combn(as.character(unique(df1$Class)), 2, FUN = paste, collapse="_")
list
输出可以 tidy
并转换为单个 data.frame
library(broom)
library(purrr)
out <- map_depth(lstOut, .depth = 2, tidy)
out2 <- map_dfr(out, ~bind_rows(.x, .id = 'colname'), .id = 'classCompare')