Julia: Contingency Table and Fisher's Exact Test with DataFrames
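What is the most elegant way to write Fisher's exact test when the contingency table is stored as a DataFrame?
My Julia code feels contrived compared with the equivalent calculations in R and Python.
Julia code: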
using DataFrames
using HypothesisTests
df = DataFrame(index=["Died", "Survived"], Treatment=[39,30961], Control=[63, 30937])
ft = FisherExactTest(df[df.index .== "Died", :].Treatment[1],
df[df.index .== "Died", :].Control[1],
df[df.index .== "Survived", :].Treatment[1],
df[df.index .== "Survived", :].Control[1])
# Fisher's exact test
# -------------------
# Population details:
# parameter of interest: Odds ratio
# value under h_0: 1.0
# point estimate: 0.618572
# 95% confidence interval: (0.4038, 0.9371)
# Test summary:
# outcome with 95% confidence: reject h_0
# two-sided p-value: 0.0222
# Details:
# contingency table:
# 39 63
# 30961 30937
pvalue(ft; tail = :left)
# 0.011094091841433727
R code:
df <- data.frame(
"Treatment" = c(39, 30961),
"Control" = c(63, 30937),
row.names = c("Died", "Survived"),
stringsAsFactors = FALSE
)
fisher.test(df, alternative="less")
# Fisher's Exact Test for Count Data
# data: df
# p-value = 0.01109
# alternative hypothesis: true odds ratio is less than 1
# 95 percent confidence interval:
# 0.000000 0.880098
# sample estimates:
# odds ratio
# 0.6185762
Python code:
import pandas as pd
from scipy.stats import fisher_exact
df = pd.DataFrame([[39, 63],[30961, 30937]],
columns=["Treatment", "Control"],
index=["Died", "Survived"])
fisher_exact(df, "less")
# (0.6185677526719483, 0.011094091844052023)
For example, you could write:
FisherExactTest(Matrix(df[1:2, 2:3])...)
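One detail of the splatting one-liner is worth spelling out. Julia iterates a matrix column by column, so the four cells reach `FisherExactTest` in the order 39, 30961, 63, 30937 rather than the row-by-row order used in the question. That amounts to testing the transposed table, and Fisher's exact test gives the same odds ratio and p-values for a 2x2 table and its transpose, so the result should not change. The sketch below is a minimal illustration, not a definitive implementation; the variable names `counts`, `ft_splat`, and `ft_rows` are made up for this example, and selecting the columns by name instead of by position is an assumption about what reads best.

```julia
using DataFrames
using HypothesisTests

df = DataFrame(index=["Died", "Survived"], Treatment=[39, 30961], Control=[63, 30937])

# Keep only the two count columns and convert them to a 2x2 integer matrix.
counts = Matrix(df[:, [:Treatment, :Control]])

# Splatting walks the matrix column by column: 39, 30961, 63, 30937.
ft_splat = FisherExactTest(counts...)

# Transposing first passes the cells row by row: 39, 63, 30961, 30937,
# which matches the argument order used in the question.
ft_rows = FisherExactTest(permutedims(counts)...)

# Both orientations are expected to give the same left-tailed p-value.
pvalue(ft_splat; tail = :left) ≈ pvalue(ft_rows; tail = :left)
```

If the printed contingency table should match the layout of the DataFrame, the `permutedims` form is the one to use; otherwise the plain splat is shorter and gives an equivalent test.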