How do I write a loop that returns Anderson-Darling test p-values?

Question

I have a matrix called matrix_1:

    c1  c2  c3  c4  c5
R1  27  38  94  40  4
R2  69  16  85  2   15
R3  30  35  64  95  6
R4  20  33  77  98  55
R5  20  44  60  33  89
R6  12  88  87  44  38

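I want to run the Anderson-Darling test (ad.test()) in a loop to compare the distribution of each column with a vector, vector_a. I want the function to return only the p-value from version 1. Here is sample output from comparing a single column with vector_a: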

T.AD = ( Anderson-Darling  Criterion - mean)/sigma

Null Hypothesis: All samples come from a common population.

             AD  T.AD  asympt. P-value
version 1: 12.9 15.72        2.416e-07
version 2: 12.9 15.76        2.371e-07

I am trying this:

sapply(1:ncol(matrix_1), function(i) ad.test(as.vector(matrix_1[,1:i]), vector_a)$p)

But it overloads the CPU and I never get a result.

Answer 1

It is good practice to identify the package you are using:

library(kSamples)

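The test results are stored in $ad. Version 1 is the first row and the p-value is the third column, so you can capture it with

output$ad[1,3]

where output stands for the object returned by ad.test().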
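To make that indexing concrete, here is a minimal sketch that runs a single test and inspects the result. The vectors x and y are made-up data for illustration only and are not part of the question.

```r
library(kSamples)

# Hypothetical sample data, used only to illustrate the result structure
set.seed(1)
x <- rnorm(20)
y <- rnorm(20, mean = 1)

result <- ad.test(x, y)

# With the default asymptotic method, result$ad is a matrix with one row
# per version and the columns AD, T.AD and the asymptotic p-value
print(result$ad)

# Row 1 is version 1, column 3 is its p-value
p_version1 <- result$ad[1, 3]
print(p_version1)
```

Printing result$ad first is a quick way to confirm which row and column you want before hard-coding the index.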

Using a sample vector, and setting up the matrix data:

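vector_a <- sample(0:100, 6)
# Use names that do not mask the base functions rownames() and colnames()
row_names <- paste0("R", seq(1,6))
col_names <- paste0("C", seq(1,5))
matrix_1 <- matrix(
c(27,  38,  94,  40,  4,
69,  16,  85,  2,   15,
30,  35,  64,  95,  6,
20,  33,  77,  98,  55,
20,  44,  60,  33,  89,
12,  88,  87,  44,  38),
# byrow = TRUE fills the matrix row by row, matching the table in the question
nrow = 6, ncol = 5, byrow = TRUE, dimnames = list(row_names, col_names))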

You can use the apply function, specifying 2 as the margin to iterate over columns:

apply(matrix_1, 2, function(matrix_column) ad.test(as.vector(matrix_column), vector_a)$ad[1,3])

This gives the version 1 p-value for each column (the exact values depend on the randomly sampled vector_a):

     C1      C2      C3      C4      C5 
0.12623 0.02507 0.39935 0.81181 0.28477
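If you prefer to keep the index-based loop from the question, a sketch along the following lines should also work. Note that matrix_1[, 1:i] in the original attempt selects columns 1 through i rather than column i alone, and that the p-value is read from $ad[1, 3] rather than $p. The fixed seed is an assumption added here only so the sketch is repeatable.

```r
library(kSamples)

set.seed(42)  # assumption: fixed seed, not part of the original answer
vector_a <- sample(0:100, 6)
matrix_1 <- matrix(
  c(27, 38, 94, 40,  4,
    69, 16, 85,  2, 15,
    30, 35, 64, 95,  6,
    20, 33, 77, 98, 55,
    20, 44, 60, 33, 89,
    12, 88, 87, 44, 38),
  nrow = 6, ncol = 5, byrow = TRUE,
  dimnames = list(paste0("R", 1:6), paste0("C", 1:5))
)

# Loop over column indices; matrix_1[, i] picks exactly one column
p_values <- vapply(
  seq_len(ncol(matrix_1)),
  function(i) ad.test(as.vector(matrix_1[, i]), vector_a)$ad[1, 3],
  numeric(1)  # each iteration must return a single number
)
names(p_values) <- colnames(matrix_1)
p_values
```

vapply is used instead of sapply so that an unexpected return value (for example NULL) raises an error instead of silently changing the shape of the result.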

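Edit to address the comment about the one-step function: matrix_column is just the name of the function's parameter. It can be any name you like. Here is the answer broken into parts: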

# Define function
ad_function <- function(matrix_column){
  ad_test_results <- ad.test(as.vector(matrix_column), vector_a) # ad.test comparing matrix_column (columns of matrix) and vector_a. Assign results to ad_test_results
  ad_test_results$ad[1,3] # This gets the p-value for version 1
}

# Now apply the matrix columns to the function
apply(matrix_1, 2, ad_function)
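The function above reads vector_a from the global environment. One possible variation, sketched below, passes the comparison vector as an explicit argument through apply's extra arguments. The names ad_p_value and reference are hypothetical and chosen here for illustration, and the seed is again an assumption.

```r
library(kSamples)

set.seed(42)  # assumption: fixed seed, not part of the original answer
vector_a <- sample(0:100, 6)
matrix_1 <- matrix(
  c(27, 38, 94, 40,  4,
    69, 16, 85,  2, 15,
    30, 35, 64, 95,  6,
    20, 33, 77, 98, 55,
    20, 44, 60, 33, 89,
    12, 88, 87, 44, 38),
  nrow = 6, ncol = 5, byrow = TRUE,
  dimnames = list(paste0("R", 1:6), paste0("C", 1:5))
)

# Hypothetical helper: returns the version 1 p-value for one column
ad_p_value <- function(matrix_column, reference) {
  ad_test_results <- ad.test(as.vector(matrix_column), reference)
  ad_test_results$ad[1, 3]
}

# Arguments after the function name are forwarded to ad_p_value
p_values <- apply(matrix_1, 2, ad_p_value, reference = vector_a)
p_values
```

Passing the vector explicitly makes the helper easier to reuse with a different comparison vector.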
