Chi-squared value vs. Fisher's exact probability for the chi-squared test on a 2×2 table

The source code in Box1, written in R, is meant to satisfy the following specifications:

  • The data are output as a matrix.
  • The values in columns 2 through 5 are chosen so that the marginal totals stay constant when they are arranged as a 2×2 table. (The marginal totals are fixed by n_A, n_B, n_T and n_F.)
  • The chi-squared value, calculated from the values in columns 2 to 5, is listed in column 6.
  • Fisher's exact probability, calculated from the values in columns 2 to 5, is listed in column 7.
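As a cross-check on the specification above: once the margins are fixed, a 2×2 table is determined by a single cell, and the probability of one such table is hypergeometric. The sketch below assumes the same margins as Box1 (n_A=14, n_B=6, n_T=13) and uses base R's `dhyper` and `chisq.test`. The names `p`, `chi2` and `tab` are illustrative and do not appear in the original code.

```r
# Margins as in Box1 (assumed here for illustration)
n_A <- 14; n_B <- 6; n_T <- 13
n_F <- n_A + n_B - n_T

# With fixed margins, TA alone determines the whole table
TA <- max(0, n_T - n_B):min(n_A, n_T)
TB <- n_T - TA
FA <- n_A - TA
FB <- n_B - TB

# Probability of each table (hypergeometric distribution)
p <- dhyper(TA, n_A, n_B, n_T)

# Chi-squared statistic without continuity correction, one table at a time
chi2 <- sapply(seq_along(TA), function(k) {
  m <- matrix(c(TA[k], FA[k], TB[k], FB[k]), nrow = 2)
  unname(suppressWarnings(chisq.test(m, correct = FALSE)$statistic))
})

tab <- cbind(TA, TB, FA, FB, chi2, p)
```

If the Box1 functions are right, `chi2` and `p` here should match columns 6 and 7 of the Box1 output for the same tables.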

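There are some points of concern with respect to the specifications above, but I would first like to focus on my question below.

My question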

In addition to the above functionality, I would like to add the following specifications:

  • Sort all rows by the value in column 6, so that the value in column 6 increases as you go down.
  • The values in column 8 should be the running total of column 7 from the top (the cumulative sum).
  • Then draw a graph with column 6 on the x-axis and column 7 on the y-axis.

In other words, I want a table and a graph like Table 1 and Fig. 1 below, produced from the result of Box1's code.

At the moment I do this in Excel every time I change the settings. Is it possible to do it in R instead?

Table 1

Fig. 1

Box1

#Function to calculate ln(n!)
ln_fact<-function(n){
  if (n==0){ans=0}else{
    ans=0
    for(i in 1:n) {ans=ans+log(i)}}
  return(ans)
}

#Function to calculate the chi-squared value of a 2×2 table
chiq_2by2<-function(TA,TB,FA,FB){
  nA=TA+FA;nB=TB+FB; ntot=nA+nB
  nF=FA+FB;nT=TA+TB
  ETA=(nT*nA)/ntot;EFA=(nF*nA)/ntot
  ETB=(nT*nB)/ntot;  EFB=(nF*nB)/ntot
  
  ch=((TA-ETA)^2)/(ETA);ch=ch+((TB-ETB)^2)/(ETB)
  ch=ch+((FA-EFA)^2)/(EFA);ch=ch+((FB-EFB)^2)/(EFB)
  return(ch)
}

#main part
##Set marginal total of 2×2.
n_A=14
n_B=6
n_T=13
n_F=n_A+n_B-n_T

##Part 1 of ln(probability of occurrence): the terms that depend only on the marginal totals
lnop1=ln_fact(n_A)+ ln_fact(n_B)+ln_fact(n_T)+ln_fact(n_F) - ln_fact(n_A+n_B)  


cnt=0;
A_tot=n_A; B_tot=n_B
resul=0
for(i in 0:A_tot){
  for(j in 0:B_tot){
##Calculating the elements of a 2×2 table.
    TA=i;  FA=A_tot-TA
    TB=j;    FB=B_tot-TB

##Check whether the cells of the 2×2 table are consistent with the fixed marginal totals.
    br1<-(TA+TB==n_T);br2<-(FA+FB==n_F)
    br3<-(TA+FA==n_A);br4<-(TB+FB==n_B)
    br=br1*br2*br3*br4
##Calculate the chi-squared value and Fisher's exact probability for the consistent tables
    if (br==1){
      cnt=cnt+1
###ln(probability of occurrence); the probability of occurrence is Fisher's exact probability of this table
      lnop=lnop1-(ln_fact(TA)+ ln_fact(TB)+ln_fact(FA)+ln_fact(FB))  
      
      pr=c(cnt,TA,TB,FA,FB,chiq_2by2(TA,TB,FA,FB),exp(lnop),0) #★1 the trailing 0 is a placeholder for column 8
      resul <- rbind(resul, pr)
    }
  }
}

resul


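This is not very clever code, but I worked it out myself. The part below "#answer of this question" is the essential part: it sorts all rows of the matrix so that the value in column 6 increases as you go down, fills column 8 with the running total of column 7, and plots column 7 against column 6. The rest is the same as the code in Box1 of the question.

Box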

#Function to calculate ln(n!)
ln_fact<-function(n){
  if (n==0){ans=0}else{
    ans=0
    for(i in 1:n) {ans=ans+log(i)}}
  return(ans)
}


#Function to calculate the chi-squared value of a 2×2 table
chiq_2by2<-function(TA,TB,FA,FB){
  nA=TA+FA;nB=TB+FB; ntot=nA+nB
  nF=FA+FB;nT=TA+TB
  ETA=(nT*nA)/ntot;EFA=(nF*nA)/ntot
  ETB=(nT*nB)/ntot;  EFB=(nF*nB)/ntot
  
  ch=((TA-ETA)^2)/(ETA);ch=ch+((TB-ETB)^2)/(ETB)
  ch=ch+((FA-EFA)^2)/(EFA);ch=ch+((FB-EFB)^2)/(EFB)
  return(ch)
}

#main part
##Set marginal total of 2×2.
n_A=14
n_B=6
n_T=13
n_F=n_A+n_B-n_T

##Part 1 of ln(probability of occurrence): the terms that depend only on the marginal totals
lnop1=ln_fact(n_A)+ ln_fact(n_B)+ln_fact(n_T)+ln_fact(n_F) - ln_fact(n_A+n_B)  


cnt=0;
A_tot=n_A; B_tot=n_B
resul=0
for(i in 0:A_tot){
  for(j in 0:B_tot){
    ##Calculating the elements of a 2×2 table.
    TA=i;  FA=A_tot-TA
    TB=j;    FB=B_tot-TB
    
    ##Check whether the cells of the 2×2 table are consistent with the fixed marginal totals.
    br1<-(TA+TB==n_T);br2<-(FA+FB==n_F)
    br3<-(TA+FA==n_A);br4<-(TB+FB==n_B)
    br=br1*br2*br3*br4
    ##Calculate the chi-squared value and Fisher's exact probability for the consistent tables
    if (br==1){
      cnt=cnt+1
      ###ln(probability of occurrence); the probability of occurrence is Fisher's exact probability of this table
      lnop=lnop1-(ln_fact(TA)+ ln_fact(TB)+ln_fact(FA)+ln_fact(FB))  
      
      pr=c(cnt,TA,TB,FA,FB,chiq_2by2(TA,TB,FA,FB),exp(lnop),0) #★1
      resul <- rbind(resul, pr)
    }
  }
}

#answer of this question
rownames(resul) <- NULL
dat=resul

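#↓The first row of resul is the all-zero row created by resul=0, which plots as the point (0,0).
#  If you do not want that point to appear on the graph, uncomment the next line.
#dat[1,1]=NA;dat=na.omit(dat)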

dat=dat[order(dat[,6]),]

dat[1,8]=dat[1,7]
for(i in 2:length(dat[,8])){dat[i,8]=dat[i-1,8]+dat[i,7]}
dat

plot(dat[,7] ~ dat[,6],col = "red")
curve(dchisq(x,1),col="red",add=TRUE)
#par(new=T) 
#plot(dat[,8] ~ dat[,6])

As a result, we get the following table and graph.

Table

Graph
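For comparison, the same sort and running-total steps can be sketched without explicit loops. This is a minimal alternative under stated assumptions, not a replacement for the code in the Box: it uses a data frame instead of a matrix, base R's `lfactorial` in place of `ln_fact`, the standard shortcut formula N(ad−bc)²/(product of margins) for the chi-squared value, and hypothetical column names `chi2`, `prob` and `cumprob`.

```r
# Margins as in the Box (assumed here for illustration)
n_A <- 14; n_B <- 6; n_T <- 13
n_F <- n_A + n_B - n_T
N <- n_A + n_B

# With fixed margins, TA alone determines the table
TA <- max(0, n_T - n_B):min(n_A, n_T)
TB <- n_T - TA; FA <- n_A - TA; FB <- n_B - TB

# ln(probability) of each table, mirroring lnop1 and lnop in the Box
lnp <- lfactorial(n_A) + lfactorial(n_B) + lfactorial(n_T) + lfactorial(n_F) -
  lfactorial(N) - lfactorial(TA) - lfactorial(TB) - lfactorial(FA) - lfactorial(FB)

# Chi-squared value via the 2x2 shortcut formula
chi2 <- N * (TA * FB - TB * FA)^2 / (n_A * n_B * n_T * n_F)

dat <- data.frame(TA, TB, FA, FB, chi2, prob = exp(lnp))
dat <- dat[order(dat$chi2), ]      # sort rows by the chi-squared value
dat$cumprob <- cumsum(dat$prob)    # running total of the probabilities
```

The graph can then be drawn with `plot(prob ~ chi2, data = dat, col = "red")`. Because this sketch never creates the all-zero first row, no (0,0) point appears.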