Why does a χ² test in scipy return a lesser test statistic?

Question

I am calculating the chi2 test statistic for a small contingency table:

import numpy as np

obs = np.array([
    [652, 576],
    [1348, 924]
])

When I compute it by hand, as shown on Wikipedia (Σ (Oᵢ - Eᵢ)² / Eᵢ), I get ~12.660, but the scipy.stats.chi2_contingency function returns a different test statistic:

>>> scipy.stats.chi2_contingency(obs)
 (12.40676502094132, 0.00042778128638335943, 1, array([[  701.71428571,  526.28571429],
   [ 1298.28571429,   973.71428571]]))

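I compared the expected frequencies in the result with my own, and they are identical. Entering my data into an online calculator also gives the same result as my manual calculation (e.g. http://www.socscistatistics.com/tests/chisquare2/default2.aspx).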

What magic does this function perform to reduce the test statistic?

Answer 1

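By default, correction is True, which means Yates' continuity correction is applied when there is one degree of freedom (as is the case here). If you set correction=False, the correction is not applied and you get 12.660... as the test statistic: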

>>> scipy.stats.chi2_contingency(obs, correction=False)
(12.660142450795965,
 0.00037353375362753034,
 1,
 array([[  701.71428571,   526.28571429],
        [ 1298.28571429,   973.71428571]]))

The documentation gives the following additional information on the correction parameter, which also summarizes Yates' correction:

If True, and the degrees of freedom is 1, apply Yates’ correction for continuity. The effect of the correction is to adjust each observed value by 0.5 towards the corresponding expected value.
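To see where the two numbers come from, here is a minimal sketch in plain Python that computes the statistic by hand, with and without the 0.5 adjustment described in the quote. The helper name chi2_statistic and its yates flag are made up for illustration and are not part of scipy; capping the adjustment so an observed value never moves past its expected value is my assumption about the intended behavior, and it does not matter for this table because every |O - E| is far larger than 0.5.

```python
def chi2_statistic(table, yates=False):
    """Pearson chi-squared statistic for a contingency table (list of rows).

    Hypothetical helper for illustration only; not a scipy function.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            diff = abs(observed - expected)
            if yates:
                # Move the observed count 0.5 toward the expected count,
                # but never past it (assumed capping).
                diff -= min(0.5, diff)
            stat += diff ** 2 / expected
    return stat


obs = [[652, 576], [1348, 924]]

uncorrected = chi2_statistic(obs)            # plain Σ (O - E)² / E
corrected = chi2_statistic(obs, yates=True)  # Σ (|O - E| - 0.5)² / E

print(round(uncorrected, 3))  # 12.66
print(round(corrected, 3))    # 12.407
```

The first value matches the manual result from the question and the correction=False output; the second matches the default output of chi2_contingency. Because the correction shrinks every |O - E| before squaring, the corrected statistic is always the smaller of the two.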

python

scipy

chi-squared

python-3.x