How to change the degrees of freedom in a chi-square test

Question

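I am trying to compute a p-value with a chi-squared goodness-of-fit test against a Poisson distribution.

The observed counts are 118, 64, 18 and the expected counts are 120, 61.25, 18.8.

I computed the probabilities from a Poisson distribution, so the degrees of freedom should be 3 - 1 - 1 = 1.

R gives me df = 4.

This is what I typed into R: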

Chi.Observed <- c(118,64,18)
Chi.Expected <- c(120,61.2,18.8)
chisq.test(Chi.Observed, Chi.Expected)

The output is:

        Pearson's Chi-squared test
data:  Chi.Observed and Chi.Expected
X-squared = 6, df = 4, p-value = 0.1991

Answer 1

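I'll show how to change the test further down, but there are a few issues here first. (Apart from the df adjustment, this CrossValidated question covers exactly the same ground as this answer, and a bit more ...)

It would help to know more about how you arrived at your expected counts. Reconstructing:
- dpois(0:1,lambda=0.51)*200 gives (120.09912,61.25055) and ppois(1,lambda=0.51,lower.tail=FALSE)*200 gives about 18.65, so I assume your categories are the probabilities of 0, 1, and >= 2 events, scaled to a total count of 200
- sum(Chi.Observed) is 200 and sum((0:2)*Chi.Observed/sum(Chi.Observed)) is 0.5, so that matches quite well.
So you derived 2 pieces of information (the total count and the Poisson mean) from the 3 values to generate your expected values, and it seems reasonable that your df should be 1.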
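The reconstruction above can be sketched in R as follows. This is only a guess at how the expected counts were produced: the category layout (0, 1, >= 2), the value lambda = 0.51, and the variable names are assumptions, not something the question states.

```r
# Assumption: the three classes are 0, 1 and ">= 2" events, 200 observations.
observed <- c(118, 64, 18)
n <- sum(observed)

# Rough estimate of the Poisson mean, treating the last class as exactly 2.
lambda_hat <- sum((0:2) * observed) / n

# Assumption: lambda = 0.51 reproduces the quoted expected counts best.
lambda <- 0.51
expected <- n * c(dpois(0:1, lambda),
                  ppois(1, lambda, lower.tail = FALSE))
round(expected, 2)

# Degrees of freedom: classes - 1 (fixed total) - 1 (estimated lambda).
df <- length(observed) - 1 - 1
```

Because the last class absorbs the whole upper tail, the expected counts sum to 200 by construction.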
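Specifying x and y does not do what you think it does (or what I thought it did): as @Dave2e points out, what you really want is to specify p instead. From the chisq.test documentation: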

if ‘x’ is a vector and ‘y’ is not given, then a goodness-of-fit test is performed ... the hypothesis tested is whether the population probabilities equal those in ‘p’, or are all equal if ‘p’ is not given.

Here is how to hack the test:

Chi.Observed <- c(118,64,18)
Chi.Expected <- c(120,61.2,18.8)
cc <- chisq.test(Chi.Observed, 
         p = Chi.Expected/sum(Chi.Expected))
cc$parameter <- c(df=1)
cc$p.value <- pchisq(cc$statistic,df=cc$parameter,
      lower.tail=FALSE)
cc 
## Chi-squared test for given probabilities
## data:  Chi.Observed
## X-squared = 0.19548, df = 1, p-value = 0.6584
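As a cross-check on the hack above, the same statistic and p-value can be computed by hand from the Pearson formula. This is a minimal sketch; the variable names are illustrative and it assumes exactly one estimated parameter (the Poisson mean).

```r
observed <- c(118, 64, 18)
expected <- c(120, 61.2, 18.8)

# Pearson statistic: sum over classes of (O - E)^2 / E.
x2 <- sum((observed - expected)^2 / expected)

# 3 classes - 1 (fixed total) - 1 (estimated Poisson mean) = 1.
df <- length(observed) - 1 - 1

# Upper-tail probability of the chi-squared distribution.
p_value <- pchisq(x2, df = df, lower.tail = FALSE)
c(statistic = x2, df = df, p.value = p_value)
```

This avoids modifying the object returned by chisq.test, at the cost of losing its printed summary.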

Looking at the code to see what actually happens when x and y are both given as vectors: R constructs this table

table(factor(Chi.Expected), factor(Chi.Observed))

       18 64 118
  18.8  1  0   0
  61.2  0  1   0
  120   0  0   1

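and then runs a contingency-table analysis on it (i.e. it tests the null hypothesis of row/column independence)! This is one of the best R gotchas I've seen in a long time ...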
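The gotcha can be checked directly. The sketch below compares the two-vector call with an explicit test on the cross-tabulated table; the names via_vectors and via_table are illustrative, and suppressWarnings only hides the small-expected-count warning that this 3x3 table triggers.

```r
x <- c(118, 64, 18)
y <- c(120, 61.2, 18.8)

# With two vectors, chisq.test() cross-tabulates them as factors ...
via_vectors <- suppressWarnings(chisq.test(x, y))

# ... so it agrees with an independence test on the 3x3 table itself.
via_table <- suppressWarnings(chisq.test(table(x, y)))

unname(via_vectors$statistic)   # same X-squared as in the question
unname(via_vectors$parameter)   # (3 - 1) * (3 - 1) = 4
```

Every distinct value becomes its own factor level, so the magnitudes of the counts never enter the test at all.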

Answer 2

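After thinking about this and reading Ben's answer above, I believe I have an explanation and/or answer. There are two parts to this problem: using the correct form of the chi-squared test, and getting the correct degrees of freedom.

The first issue is calling chisq.test in the correct form. If you use the form chisq.test(x, y), R builds a 3x3 contingency table, which results in a p-value that is too low.
See test1 below. test1$observed and test1$expected do not reproduce the original inputs.

The correct form is chisq.test(x, p = p), where p is the vector of expected probabilities for x (p must be passed by name, because a second positional argument is taken as y).
This is shown as test2 below. The p-value has now gone from 19% to 90%. (This would be my answer, but I will defer to the better statisticians.)

To adjust the degrees of freedom to 1, see Ben Bolker's answer. That result is shown as test3, with a p-value of 66%.

I hope this provides an acceptable explanation.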

Chi.Observed <- c(118,64,18)
Chi.Expected <- c(120,61.2,18.8)

test1<-chisq.test(Chi.Observed, Chi.Expected) # this is a 3x3 contingency table.
test1
# Pearson's Chi-squared test
# 
# data:  Chi.Observed and Chi.Expected
# X-squared = 6, df = 4, p-value = 0.1991
# 
#This result is incorrect as it...
# forms a 3x3 contingency table as shown by: 
test1$observed   # observed counts 
test1$expected   # expected counts under the null


#chisq using the expected probabilities:
test2<-chisq.test(Chi.Observed, p= Chi.Expected/sum(Chi.Expected))
test2
# Chi-squared test for given probabilities
# 
# data:  Chi.Observed
# X-squared = 0.19548, df = 2, p-value = 0.9069


#adjust degrees of freedom as per Ben's answer
test3 <- chisq.test(Chi.Observed,  p = Chi.Expected/sum(Chi.Expected))
test3$parameter <- c(df=1)
test3$p.value <- pchisq(test3$statistic, df=test3$parameter, lower.tail=FALSE)
test3 
# Chi-squared test for given probabilities
# 
# data:  Chi.Observed
# X-squared = 0.19548, df = 1, p-value = 0.6584
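
If this adjustment is needed more than once, the steps in test3 could be wrapped in a small function. The helper below is hypothetical (chisq_gof_adjusted and n_estimated are made-up names, not part of base R), and it assumes the only correction needed is subtracting the number of estimated parameters from the degrees of freedom.

```r
# Hypothetical helper, not part of base R: goodness-of-fit test whose
# degrees of freedom are reduced by the number of estimated parameters.
chisq_gof_adjusted <- function(observed, expected, n_estimated = 0) {
  fit <- chisq.test(observed, p = expected / sum(expected))
  df <- length(observed) - 1 - n_estimated
  stopifnot(df >= 1)
  # Overwrite df and recompute the p-value from the unchanged statistic.
  fit$parameter <- c(df = df)
  fit$p.value <- pchisq(unname(fit$statistic), df = df, lower.tail = FALSE)
  fit
}

test4 <- chisq_gof_adjusted(c(118, 64, 18), c(120, 61.2, 18.8),
                            n_estimated = 1)
test4
```

With n_estimated = 0 the function returns the same df and p-value as test2.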
