lm() and t.test(var.equal = TRUE) give different results on one machine but not on another. Possible reasons?

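I noticed strange behavior in lm(); more specifically, its t values do not match those of t.test(). This only happens on my machine, regardless of which packages or objects are loaded in the global environment. Running the example from help(t.test):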

t.test(extra ~ group, data = sleep, var.equal = TRUE)

produces the following result:

## 
##  Two Sample t-test
## 
## data:  extra by group
## t = -1.8608, df = 18, p-value = 0.07919
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.363874  0.203874
## sample estimates:
## mean in group 1 mean in group 2 
##            0.75            2.33

whereas the "same" model fitted with lm():

summary(lm(extra ~ group, data = sleep))

yields:

## 
## Call:
## lm(formula = extra ~ group, data = sleep)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -13.0095  -0.1152   1.3117   3.4194  11.4571 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)   -1.488      2.028  -0.734   0.4725  
## group2         4.962      2.147   2.311   0.0329 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.823 on 18 degrees of freedom
## Multiple R-squared:  0.3699, Adjusted R-squared:  0.3349 
## F-statistic: 10.57 on 1 and 18 DF,  p-value: 0.00444

The t value from t.test is -1.8608, while the t value from lm is 2.311. What are possible reasons for this discrepancy?
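A quick way to see which of the two results is off is to compare the lm() coefficients with the group means computed directly. This is only a sketch; it relies on R's default treatment contrasts, under which the intercept is the mean of group 1 and the group2 coefficient is the difference between the two group means. The names `fit`, `group_means`, `expected` and `lm_check` are made up for this illustration.

```r
# Fit the same model as in the question.
fit <- lm(extra ~ group, data = sleep)

# Group means computed directly, without a least-squares solver.
group_means <- tapply(sleep$extra, sleep$group, mean)

# Assumption: default treatment contrasts, so
#   (Intercept) = mean of group 1
#   group2      = mean of group 2 - mean of group 1
expected <- c(group_means[["1"]], group_means[["2"]] - group_means[["1"]])

# TRUE when lm() agrees with the direct computation; otherwise
# all.equal() returns a description of the mismatch.
lm_check <- all.equal(unname(coef(fit)), expected)
print(lm_check)
```

The t.test() output above reports the means as 0.75 and 2.33, so the coefficients should be 0.75 and 1.58. A correct lm() fit should then give the same t value as t.test() up to sign, because t.test() tests group 1 minus group 2 while the group2 coefficient is group 2 minus group 1.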

This code was run in R Markdown (and therefore in a fresh session), with no other code run beforehand.

Session info

sessionInfo()
## R version 3.6.1 (2019-07-05)
## Platform: x86_64-solus-linux-gnu (64-bit)
## Running under: Solus 4.0 Fortitude
## 
## Matrix products: default
## BLAS/LAPACK: /usr/lib64/haswell/libopenblas_haswellp-r0.3.2.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## loaded via a namespace (and not attached):
##  [1] compiler_3.6.1  magrittr_1.5    tools_3.6.1     htmltools_0.4.0
##  [5] yaml_2.2.0      Rcpp_1.0.3      stringi_1.4.3   rmarkdown_1.17 
##  [9] knitr_1.25      stringr_1.4.0   xfun_0.10       digest_0.6.23  
## [13] rlang_0.4.2     evaluate_0.14

The other "machine"

I tried the same thing in a Docker container (rocker/verse:3.6.1, the same R version as on my machine), and there the results are consistent:

t.test(extra ~ group, data = sleep, var.equal = TRUE)
## 
##  Two Sample t-test
## 
## data:  extra by group
## t = -1.8608, df = 18, p-value = 0.07919
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.363874  0.203874
## sample estimates:
## mean in group 1 mean in group 2 
##            0.75            2.33

summary(lm(extra ~ group, data = sleep))
## 
## Call:
## lm(formula = extra ~ group, data = sleep)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -2.430 -1.305 -0.580  1.455  3.170 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)   0.7500     0.6004   1.249   0.2276  
## group2        1.5800     0.8491   1.861   0.0792 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.899 on 18 degrees of freedom
## Multiple R-squared:  0.1613, Adjusted R-squared:  0.1147 
## F-statistic: 3.463 on 1 and 18 DF,  p-value: 0.07919

sessionInfo()
## R version 3.6.1 (2019-07-05)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Debian GNU/Linux 9 (stretch)
## 
## Matrix products: default
## BLAS/LAPACK: /usr/lib/libopenblasp-r0.2.19.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=C             
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## loaded via a namespace (and not attached):
##  [1] compiler_3.6.1  magrittr_1.5    tools_3.6.1     htmltools_0.4.0
##  [5] yaml_2.2.0      Rcpp_1.0.3      stringi_1.4.3   rmarkdown_1.16 
##  [9] knitr_1.25      stringr_1.4.0   xfun_0.10       digest_0.6.22  
## [13] rlang_0.4.1     evaluate_0.14

As far as I can tell, the only difference is the BLAS/LAPACK version.
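To compare the linear algebra libraries on the two machines without reading through the whole sessionInfo() output, the relevant paths can be queried directly. This is a minimal sketch using functions from base R (available in R 3.4.0 and later); the variable names are illustrative only.

```r
# Path of the BLAS library R is using (may be "" if it cannot be determined).
blas_path <- extSoftVersion()[["BLAS"]]

# Path of the LAPACK library and the LAPACK version it reports.
lapack_path <- La_library()
lapack_version <- La_version()

cat("BLAS:          ", blas_path, "\n")
cat("LAPACK:        ", lapack_path, "\n")
cat("LAPACK version:", lapack_version, "\n")
```

Running this on both machines should show the same library paths as the "BLAS/LAPACK:" line in the session info above.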

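I recently ran into the same problem while setting up SolusOS and also posted about it here. For me, the OpenBLAS library seems to be the culprit, namely libopenblas_haswellp-r0.3.2.so. Once I switched to a different library, in my case libopenblas_core2p-r0.3.2.so, I started getting correct results on my SolusOS setup.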

(Actually, I just edited my own post to include this information.)
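One rough way to probe a suspect BLAS build is to compare a matrix product, which R normally hands to BLAS, against the same quantity computed with plain element-wise arithmetic. The sketch below does this for the design matrix of the sleep model. It is an assumption that this exercises the faulty routine at all: lm() may go through different BLAS calls than crossprod(), so a passing check does not prove the library is fine. All variable names are made up for this example.

```r
# Design matrix of the model from the question: intercept + group2 dummy.
X <- model.matrix(extra ~ group, data = sleep)

# X'X via crossprod(), which typically dispatches to BLAS.
xtx_blas <- crossprod(X)

# The same X'X from element-wise products and sum(), bypassing BLAS.
p <- ncol(X)
xtx_plain <- matrix(0, p, p)
for (i in seq_len(p)) {
  for (j in seq_len(p)) {
    xtx_plain[i, j] <- sum(X[, i] * X[, j])
  }
}

# TRUE when both computations agree within numerical tolerance.
blas_ok <- isTRUE(all.equal(unname(xtx_blas), xtx_plain))
print(blas_ok)
```

If this prints FALSE, or if lm() still disagrees with the group means, switching the library as described above and rerunning the check is a reasonable next step.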