How to use apply() on an array of matrices in R?

I am trying to use apply() on an array of matrices. Here is an example:

data(UCBAdmissions)
fisher.test(UCBAdmissions[,,1]) #This works great
apply(UCBAdmissions, c(1,2,3), fisher.test) #This fails

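I would do it like this. First, create a list, UCB_list. Then use rbindlist() from data.table to bind the list elements into a data frame. Finally, call lapply() over the columns of that data frame, passing y = df$Gender as the second variable for each test.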

library(data.table)
UCB_list <- list(UCBAdmissions)
df <- rbindlist(lapply(UCB_list, data.frame))

lapply(df, fisher.test, y = df$Gender)
$Admit

    Fisher's Exact Test for Count Data

data:  X[[i]] and df$Gender
p-value = 1
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.1537975 6.5020580
sample estimates:
odds ratio 
         1 


$Gender

    Fisher's Exact Test for Count Data

data:  X[[i]] and df$Gender
p-value = 7.396e-07
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 16.56459      Inf
sample estimates:
odds ratio 
       Inf 


$Dept

    Fisher's Exact Test for Count Data

data:  X[[i]] and df$Gender
p-value = 1
alternative hypothesis: two.sided


$Freq

    Fisher's Exact Test for Count Data

data:  X[[i]] and df$Gender
p-value = 0.4783
alternative hypothesis: two.sided
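
For orientation, the data frame built above is the long-format version of the table: one row per Admit/Gender/Dept combination, plus a Freq column holding the count. A minimal base R sketch of that shape, assuming as.data.frame() as a stand-in for the rbindlist() step (the name `long_df` is hypothetical and not part of the answer):

```r
data(UCBAdmissions)

# Long format: one row per cell of the 2 x 2 x 6 table
long_df <- as.data.frame(UCBAdmissions)

dim(long_df)      # number of rows and columns
names(long_df)    # the three factors plus the count column
head(long_df, 4)  # first few cells with their counts
```

Note that the counts live in the Freq column, so a test that uses only the factor columns does not take them into account.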

The UCBAdmissions data contains six contingency tables along the Dept dimension: "A", "B", "C", "D", "E", and "F".

dimnames(UCBAdmissions)
#$Admit
#[1] "Admitted" "Rejected"

#$Gender
#[1] "Male"   "Female"

#$Dept
#[1] "A" "B" "C" "D" "E" "F"
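
As a quick check of that layout, the sketch below looks at the dimensions of the array and pulls out a single department's table by name (the variable `dept_a` is only an illustrative name):

```r
data(UCBAdmissions)

# Three dimensions: Admit x Gender x Dept
dim(UCBAdmissions)

# One 2 x 2 Admit-by-Gender table, selected by department name
dept_a <- UCBAdmissions[, , "A"]
dept_a
```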

You can apply fisher.test to each of these six tables. From your code, apply(UCBAdmissions, c(1,2,3), fisher.test), it is not clear to me which part of the six tables you want to test.
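
One likely reason the original call fails: with MARGIN = c(1, 2, 3) every dimension is used as a margin, so the function receives one cell at a time rather than a table. The sketch below illustrates this by asking apply() what it passes along in each case (the variable names are illustrative only):

```r
data(UCBAdmissions)

# MARGIN = c(1, 2, 3): FUN is called once per cell, on a single count
cell_lengths <- apply(UCBAdmissions, c(1, 2, 3), length)
unique(as.vector(cell_lengths))

# MARGIN = 3: FUN is called once per department, on a 2 x 2 matrix
slice_dims <- apply(UCBAdmissions, 3, dim)
slice_dims

# A single count is not a valid input for fisher.test(), hence the error
single_cell <- try(fisher.test(UCBAdmissions[1, 1, 1]), silent = TRUE)
inherits(single_cell, "try-error")
```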

If you want to apply fisher.test to the first three of the six tables, i.e. "A", "B", and "C", subset the UCBAdmissions data first and then set MARGIN to 3.

apply(UCBAdmissions[,,1:3], 3, fisher.test)

# $A
# 
# Fisher's Exact Test for Count Data
# 
# data:  array(newX[, i], d.call, dn.call)
# p-value = 1.669e-05
# alternative hypothesis: true odds ratio is not equal to 1
# 95 percent confidence interval:
#  0.1970420 0.5920417
# sample estimates:
# odds ratio 
#  0.3495628 
# 
# 
# $B
# 
# Fisher's Exact Test for Count Data
# 
# data:  array(newX[, i], d.call, dn.call)
# p-value = 0.6771
# alternative hypothesis: true odds ratio is not equal to 1
# 95 percent confidence interval:
#  0.2944986 2.0040231
# sample estimates:
# odds ratio 
#  0.8028124 
# 
# 
# $C
# 
# Fisher's Exact Test for Count Data
# 
# data:  array(newX[, i], d.call, dn.call)
# p-value = 0.3866
# alternative hypothesis: true odds ratio is not equal to 1
# 95 percent confidence interval:
#  0.8452173 1.5162918
# sample estimates:
# odds ratio 
#     1.1329 
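
If only the p-values are of interest, the same pattern can return them directly. This is a small sketch, not part of the original answer, that runs the test for all six departments and keeps one number per table (`p_values` is an illustrative name):

```r
data(UCBAdmissions)

# One Fisher test per department; keep only the p-value from each result
p_values <- apply(UCBAdmissions, 3, function(tab) fisher.test(tab)$p.value)

# Named numeric vector, one entry per department
round(p_values, 4)
```

Because the anonymous function returns a single number, apply() simplifies the result to a named vector instead of a list of test objects.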

Another option is to replace 3 with the name of the dimension:

apply(UCBAdmissions[,,1:3], "Dept", fisher.test)

This gives exactly the same result as the previous code.
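
A short sketch to confirm the equivalence, comparing p-values rather than full printed output (the names `abc`, `by_index` and `by_name` are illustrative only):

```r
data(UCBAdmissions)
abc <- UCBAdmissions[, , 1:3]

# "Dept" is the name of the third dimension
which(names(dimnames(abc)) == "Dept")

by_index <- apply(abc, 3, function(tab) fisher.test(tab)$p.value)
by_name  <- apply(abc, "Dept", function(tab) fisher.test(tab)$p.value)

identical(by_index, by_name)
```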

Alternatively, if you want to apply fisher.test to the contingency table of Admit by Dept for departments "A", "B", and "C", separately for each Gender, you can use:

apply(UCBAdmissions[,,1:3], "Gender", fisher.test)

# $Male
# 
# Fisher's Exact Test for Count Data
# 
# data:  array(newX[, i], d.call, dn.call)
# p-value = 7.217e-16
# alternative hypothesis: two.sided
# 
# 
# $Female
# 
# Fisher's Exact Test for Count Data
# 
# data:  array(newX[, i], d.call, dn.call)
# p-value < 2.2e-16
# alternative hypothesis: two.sided
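
The output above shows no odds ratio or confidence interval because each Gender slice is a 2 x 3 table, and fisher.test() reports those estimates only for 2 x 2 tables. A brief sketch that inspects the slices (variable names are illustrative):

```r
data(UCBAdmissions)
abc <- UCBAdmissions[, , 1:3]

# Each Gender slice is an Admit-by-Dept table with 2 rows and 3 columns
apply(abc, "Gender", dim)

# The table that is tested for the Male group
male <- abc[, "Male", ]
male

res <- fisher.test(male)
res$p.value

# No odds-ratio estimate is returned for a table larger than 2 x 2
is.null(res$estimate)
```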

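To show more clearly which part is being tested, I reshape the data and then filter it so that only male students in departments A, B, and C remain. Then I apply fisher.test to the result.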

library(dplyr)   # %>% and filter()
library(tidyr)   # pivot_wider()

DF <- UCBAdmissions %>%
      as.data.frame() %>%
      filter(Gender == "Male",
             Dept %in% c("A", "B", "C")) %>%
      pivot_wider(id_cols = -Gender, names_from = Admit, values_from = Freq)
DF
# # A tibble: 3 x 3
#   Dept  Admitted Rejected
#   <fct>    <dbl>    <dbl>
# 1 A          512      313
# 2 B          353      207
# 3 C          120      205

fisher.test(DF[1:3, 2:3])
# 
# Fisher's Exact Test for Count Data
# 
# data:  DF[1:3, 2:3]
# p-value = 7.217e-16
# alternative hypothesis: two.sided

The result is exactly the same as the Male result from apply(UCBAdmissions[,,1:3], "Gender", fisher.test).
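
The same table can also be built without reshaping packages. The following is a base R sketch, offered as an alternative rather than as the answer's method, that indexes the array directly and transposes it so that rows are departments (`male_abc` is an illustrative name):

```r
data(UCBAdmissions)

# Male applicants in departments A, B and C; transpose so that rows are
# departments and columns are Admitted / Rejected
male_abc <- t(UCBAdmissions[, "Male", c("A", "B", "C")])
male_abc

fisher.test(male_abc)$p.value
```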