Julia: Contingency Table and Fisher's Exact Test with DataFrames

Question

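What is the most elegant way to write Fisher's exact test when the contingency table is stored as a DataFrame?

Compared with the equivalent computations in R and Python, my Julia code feels contrived.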

Julia code:

using DataFrames
using HypothesisTests
df = DataFrame(index=["Died", "Survived"], Treatment=[39,30961], Control=[63, 30937])
ft = FisherExactTest(df[df.index .== "Died", :].Treatment[1],
                     df[df.index .== "Died", :].Control[1],
                     df[df.index .== "Survived", :].Treatment[1],
                     df[df.index .== "Survived", :].Control[1])

# Fisher's exact test
# -------------------
# Population details:
#     parameter of interest:   Odds ratio
#     value under h_0:         1.0
#     point estimate:          0.618572
#     95% confidence interval: (0.4038, 0.9371)

# Test summary:
#     outcome with 95% confidence: reject h_0
#     two-sided p-value:           0.0222

# Details:
#     contingency table:
#            39     63
#         30961  30937

pvalue(ft; tail = :left)
# 0.011094091841433727

R code:


df <- data.frame(
  "Treatment" = c(39, 30961),
  "Control" = c(63, 30937),
  row.names = c("Died", "Survived"),
  stringsAsFactors = FALSE
)

fisher.test(df, alternative="less")
#   Fisher's Exact Test for Count Data
# data:  df
# p-value = 0.01109
# alternative hypothesis: true odds ratio is less than 1
# 95 percent confidence interval:
#  0.000000 0.880098
# sample estimates:
# odds ratio 
#  0.6185762

Python code:

import pandas as pd
from scipy.stats import fisher_exact

df = pd.DataFrame([[39, 63],[30961, 30937]],
                  columns=["Treatment", "Control"],
                  index=["Died", "Survived"])

fisher_exact(df, "less") 
# (0.6185677526719483, 0.011094091844052023)

Answer 1

For example, you could write:

FisherExactTest(Matrix(df[1:2, 2:3])...)
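One detail worth noting: splatting a Julia matrix iterates it column by column, so the one-liner passes the counts as 39, 30961, 63, 30937, which is the transposed table. Fisher's exact test is symmetric under transposition, so the result should not change, but if you want the arguments in the same order as in the question, a minimal sketch (selecting columns by name instead of position, and assuming the `df` from the question) could look like this:

```julia
using DataFrames
using HypothesisTests

df = DataFrame(index=["Died", "Survived"], Treatment=[39, 30961], Control=[63, 30937])

# Numeric part of the table: rows are Died/Survived, columns are Treatment/Control.
m = Matrix(df[:, [:Treatment, :Control]])

# permutedims swaps rows and columns, so the column-major splat yields
# 39, 63, 30961, 30937, i.e. the counts row by row as in the question.
ft = FisherExactTest(permutedims(m)...)

p_left = pvalue(ft; tail = :left)
```

With the data above, `p_left` should reproduce the left-tailed p-value shown in the question (about 0.0111).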


Tags: julia, dataframes.jl