为成对 t 检验编写函数

Question

我目前正尝试在 R 中编写一个函数，该函数允许我计算数据框中所有可能的成对 t 检验（我知道存在可以实现此目的的函数，但我也想学习如何成功编写函数）。我运行遇到了一个我不知道如何解决的问题。

数据：

library(combinat) # for generating pairwise combinations of variables

apple <- rnorm(100)
banana <- rnorm(100)
pear <- rnorm(100)
orange <- rnorm(100)
pineapple <- rnorm(100)


data <- data.frame(apple, banana, pear, orange, pineapple)

我的想法是使用for循环查找table列名组合中的每一对列名，使用匹配函数引用原始数据集中的关联列号，然后调用关联的列名称作为 t.test 函数中的元素。这个过程是独立工作的，但我运行在尝试迭代它时遇到了问题。

combinations <- combn2(names(data)) # creates a 2x10 table of all the combinations of the 5 column names

a<-match(combinations[8,1],colnames(data))
a<-data[,a]
b<-match(combinations[8,2],colnames(data))
b<-data[,b]
t.test(a, b)

# This works as expected

这是我尝试使用 for 循环自动执行此过程的尝试：

test <- function(data) {
  names <- names(data)
  combinations <- combinat::combn2(names(data))
  num_rows <- NROW(combinations)
  for (i in 1:num_rows) {
    x<- match(combinations[i,1],colnames(data))
    x<-data[,x]
    y<- match(combinations[i,2],colnames(data))
    y<-data[,y]
    t.test(x, y)
  }
}

test(data)
summary(test(data))

结果为空。我显然遗漏了一些东西，但我不确定如何进行。感谢任何帮助。

Answer 1

您需要为 t.test(x, y)

的输出分配一个引用

试试这个：

test <- function(data) {
    names <- names(data)
    combinations <- combinat::combn2(names(data))
    num_rows <- nrow(combinations)
    
    test_results <- vector(mode = "list", length = num_rows)
    for (i in 1:num_rows) {
        x <- match(combinations[i,1],colnames(data))
        x <- data[,x]
        y <- match(combinations[i,2],colnames(data))
        y <- data[,y]
        test_results[[i]] <- t.test(x, y)
    }
    
    return(test_results)
}

这将为您提供一个列表输出，其中每个条目都是根据您的要求对特定字段组合执行的不同 t 检验。

Answer 2

combn（不是combn2）的第三个参数接受一个可以应用于每个组合的函数。你可以简单地做

combn(data, 2L, \(d) {
  syms <- lapply(names(d), as.symbol)
  names(syms) <- c("x", "y")
  eval(bquote(t.test(.(x), .(y)), syms), d)
}, FALSE)

输出

[[1]]

    Welch Two Sample t-test

data:  apple and banana
t = -0.11531, df = 197.6, p-value = 0.9083
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.3017470  0.2684074
sample estimates:
  mean of x   mean of y 
-0.03961686 -0.02294705 


[[2]]

    Welch Two Sample t-test

data:  apple and pear
t = -0.78348, df = 197.86, p-value = 0.4343
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.3841981  0.1657171
sample estimates:
  mean of x   mean of y 
-0.03961686  0.06962364 


[[3]]

    Welch Two Sample t-test

data:  apple and orange
t = -0.55681, df = 196.65, p-value = 0.5783
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.3433412  0.1921482
sample estimates:
  mean of x   mean of y 
-0.03961686  0.03597966 


[[4]]

    Welch Two Sample t-test

data:  apple and pineapple
t = 0.038627, df = 197.99, p-value = 0.9692
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.2739606  0.2849074
sample estimates:
  mean of x   mean of y 
-0.03961686 -0.04509027 


[[5]]

    Welch Two Sample t-test

data:  banana and pear
t = -0.64848, df = 196.99, p-value = 0.5174
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.3740876  0.1889462
sample estimates:
  mean of x   mean of y 
-0.02294705  0.06962364 


[[6]]

    Welch Two Sample t-test

data:  banana and orange
t = -0.4234, df = 194.84, p-value = 0.6725
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.3334116  0.2155582
sample estimates:
  mean of x   mean of y 
-0.02294705  0.03597966 


[[7]]

    Welch Two Sample t-test

data:  banana and pineapple
t = 0.15274, df = 197.7, p-value = 0.8788
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.2637425  0.3080290
sample estimates:
  mean of x   mean of y 
-0.02294705 -0.04509027 


[[8]]

    Welch Two Sample t-test

data:  pear and orange
t = 0.25138, df = 197.38, p-value = 0.8018
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.2302948  0.2975828
sample estimates:
 mean of x  mean of y 
0.06962364 0.03597966 


[[9]]

    Welch Two Sample t-test

data:  pear and pineapple
t = 0.82024, df = 197.79, p-value = 0.4131
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.1610834  0.3905112
sample estimates:
  mean of x   mean of y 
 0.06962364 -0.04509027 


[[10]]

    Welch Two Sample t-test

data:  orange and pineapple
t = 0.59521, df = 196.45, p-value = 0.5524
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.1875381  0.3496780
sample estimates:
  mean of x   mean of y 
 0.03597966 -0.04509027 


[[1]]

    Welch Two Sample t-test

data:  apple and banana
t = -0.11531, df = 197.6, p-value = 0.9083
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.3017470  0.2684074
sample estimates:
  mean of x   mean of y 
-0.03961686 -0.02294705 


[[2]]

    Welch Two Sample t-test

data:  apple and pear
t = -0.78348, df = 197.86, p-value = 0.4343
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.3841981  0.1657171
sample estimates:
  mean of x   mean of y 
-0.03961686  0.06962364 


[[3]]

    Welch Two Sample t-test

data:  apple and orange
t = -0.55681, df = 196.65, p-value = 0.5783
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.3433412  0.1921482
sample estimates:
  mean of x   mean of y 
-0.03961686  0.03597966 


[[4]]

    Welch Two Sample t-test

data:  apple and pineapple
t = 0.038627, df = 197.99, p-value = 0.9692
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.2739606  0.2849074
sample estimates:
  mean of x   mean of y 
-0.03961686 -0.04509027 


[[5]]

    Welch Two Sample t-test

data:  banana and pear
t = -0.64848, df = 196.99, p-value = 0.5174
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.3740876  0.1889462
sample estimates:
  mean of x   mean of y 
-0.02294705  0.06962364 


[[6]]

    Welch Two Sample t-test

data:  banana and orange
t = -0.4234, df = 194.84, p-value = 0.6725
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.3334116  0.2155582
sample estimates:
  mean of x   mean of y 
-0.02294705  0.03597966 


[[7]]

    Welch Two Sample t-test

data:  banana and pineapple
t = 0.15274, df = 197.7, p-value = 0.8788
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.2637425  0.3080290
sample estimates:
  mean of x   mean of y 
-0.02294705 -0.04509027 


[[8]]

    Welch Two Sample t-test

data:  pear and orange
t = 0.25138, df = 197.38, p-value = 0.8018
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.2302948  0.2975828
sample estimates:
 mean of x  mean of y 
0.06962364 0.03597966 


[[9]]

    Welch Two Sample t-test

data:  pear and pineapple
t = 0.82024, df = 197.79, p-value = 0.4131
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.1610834  0.3905112
sample estimates:
  mean of x   mean of y 
 0.06962364 -0.04509027 


[[10]]

    Welch Two Sample t-test

data:  orange and pineapple
t = 0.59521, df = 196.45, p-value = 0.5524
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.1875381  0.3496780
sample estimates:
  mean of x   mean of y 
 0.03597966 -0.04509027 


[[1]]

    Welch Two Sample t-test

data:  apple and banana
t = -0.11531, df = 197.6, p-value = 0.9083
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.3017470  0.2684074
sample estimates:
  mean of x   mean of y 
-0.03961686 -0.02294705 


[[2]]

    Welch Two Sample t-test

data:  apple and pear
t = -0.78348, df = 197.86, p-value = 0.4343
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.3841981  0.1657171
sample estimates:
  mean of x   mean of y 
-0.03961686  0.06962364 


[[3]]

    Welch Two Sample t-test

data:  apple and orange
t = -0.55681, df = 196.65, p-value = 0.5783
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.3433412  0.1921482
sample estimates:
  mean of x   mean of y 
-0.03961686  0.03597966 


[[4]]

    Welch Two Sample t-test

data:  apple and pineapple
t = 0.038627, df = 197.99, p-value = 0.9692
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.2739606  0.2849074
sample estimates:
  mean of x   mean of y 
-0.03961686 -0.04509027 


[[5]]

    Welch Two Sample t-test

data:  banana and pear
t = -0.64848, df = 196.99, p-value = 0.5174
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.3740876  0.1889462
sample estimates:
  mean of x   mean of y 
-0.02294705  0.06962364 


[[6]]

    Welch Two Sample t-test

data:  banana and orange
t = -0.4234, df = 194.84, p-value = 0.6725
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.3334116  0.2155582
sample estimates:
  mean of x   mean of y 
-0.02294705  0.03597966 


[[7]]

    Welch Two Sample t-test

data:  banana and pineapple
t = 0.15274, df = 197.7, p-value = 0.8788
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.2637425  0.3080290
sample estimates:
  mean of x   mean of y 
-0.02294705 -0.04509027 


[[8]]

    Welch Two Sample t-test

data:  pear and orange
t = 0.25138, df = 197.38, p-value = 0.8018
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.2302948  0.2975828
sample estimates:
 mean of x  mean of y 
0.06962364 0.03597966 


[[9]]

    Welch Two Sample t-test

data:  pear and pineapple
t = 0.82024, df = 197.79, p-value = 0.4131
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.1610834  0.3905112
sample estimates:
  mean of x   mean of y 
 0.06962364 -0.04509027 


[[10]]

    Welch Two Sample t-test

data:  orange and pineapple
t = 0.59521, df = 196.45, p-value = 0.5524
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.1875381  0.3496780
sample estimates:
  mean of x   mean of y 
 0.03597966 -0.04509027

为成对 t 检验编写函数

Writing a function for pairwise t-tests

for-loop

r

function

t-test