lm() and t.test(var.equal = TRUE) differ on one machine but not on another. Possible reasons?
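I noticed strange behavior of lm(): more specifically, the t values do not come out right. This behavior can only be observed on my machine, and it occurs regardless of which packages or objects are loaded in the global environment.
Running the example from help(t.test):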
t.test(extra ~ group, data = sleep, var.equal = TRUE)
yields the following result:
##
## Two Sample t-test
##
## data: extra by group
## t = -1.8608, df = 18, p-value = 0.07919
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -3.363874 0.203874
## sample estimates:
## mean in group 1 mean in group 2
## 0.75 2.33
whereas the "same" model fitted with lm():
summary(lm(extra ~ group, data = sleep))
yields:
##
## Call:
## lm(formula = extra ~ group, data = sleep)
##
## Residuals:
## Min 1Q Median 3Q Max
## -13.0095 -0.1152 1.3117 3.4194 11.4571
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.488 2.028 -0.734 0.4725
## group2 4.962 2.147 2.311 0.0329 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.823 on 18 degrees of freedom
## Multiple R-squared: 0.3699, Adjusted R-squared: 0.3349
## F-statistic: 10.57 on 1 and 18 DF, p-value: 0.00444
t.test t value: -1.8608 vs. lm t value: 2.311. What could be the reason for this discrepancy?
This code was run in R Markdown (and therefore in a fresh session), with no other code run beforehand.
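For reference, here is a minimal sketch of the consistency check implied above. It assumes R's default treatment contrasts, under which the lm() intercept should equal the mean of group 1 and the group2 coefficient should equal the difference in group means. The variable names (fit, group_means, coef_ok, t_ok) are only illustrative.

```r
# Fit the two-group model and compute the group means directly.
fit <- lm(extra ~ group, data = sleep)
group_means <- as.numeric(tapply(sleep$extra, sleep$group, mean))

# Assumption: default treatment contrasts, so the intercept is the mean of
# group 1 and the group2 coefficient is mean(group 2) - mean(group 1).
expected_coef <- c(group_means[1], group_means[2] - group_means[1])
coef_ok <- isTRUE(all.equal(unname(coef(fit)), expected_coef))

# The pooled-variance t statistic should match the lm() t value up to sign.
tt <- t.test(extra ~ group, data = sleep, var.equal = TRUE)
t_lm <- coef(summary(fit))["group2", "t value"]
t_ok <- isTRUE(all.equal(abs(unname(tt$statistic)), abs(t_lm)))

cat("coefficients match group means:", coef_ok, "\n")
cat("t statistics match up to sign: ", t_ok, "\n")
```

On a correctly working installation both checks should report TRUE. With the lm() output shown above (intercept -1.488 against a group 1 mean of 0.75), both would be expected to fail.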
Session info
sessionInfo()
## R version 3.6.1 (2019-07-05)
## Platform: x86_64-solus-linux-gnu (64-bit)
## Running under: Solus 4.0 Fortitude
##
## Matrix products: default
## BLAS/LAPACK: /usr/lib64/haswell/libopenblas_haswellp-r0.3.2.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## loaded via a namespace (and not attached):
## [1] compiler_3.6.1 magrittr_1.5 tools_3.6.1 htmltools_0.4.0
## [5] yaml_2.2.0 Rcpp_1.0.3 stringi_1.4.3 rmarkdown_1.17
## [9] knitr_1.25 stringr_1.4.0 xfun_0.10 digest_0.6.23
## [13] rlang_0.4.2 evaluate_0.14
The other "machine"
I tried the same thing in a Docker container (rocker/verse:3.6.1, the same R version as on my machine), and there the results are consistent:
t.test(extra ~ group, data = sleep, var.equal = TRUE)
##
## Two Sample t-test
##
## data: extra by group
## t = -1.8608, df = 18, p-value = 0.07919
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -3.363874 0.203874
## sample estimates:
## mean in group 1 mean in group 2
## 0.75 2.33
summary(lm(extra ~ group, data = sleep))
##
## Call:
## lm(formula = extra ~ group, data = sleep)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.430 -1.305 -0.580 1.455 3.170
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.7500 0.6004 1.249 0.2276
## group2 1.5800 0.8491 1.861 0.0792 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.899 on 18 degrees of freedom
## Multiple R-squared: 0.1613, Adjusted R-squared: 0.1147
## F-statistic: 3.463 on 1 and 18 DF, p-value: 0.07919
sessionInfo()
## R version 3.6.1 (2019-07-05)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Debian GNU/Linux 9 (stretch)
##
## Matrix products: default
## BLAS/LAPACK: /usr/lib/libopenblasp-r0.2.19.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=C
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## loaded via a namespace (and not attached):
## [1] compiler_3.6.1 magrittr_1.5 tools_3.6.1 htmltools_0.4.0
## [5] yaml_2.2.0 Rcpp_1.0.3 stringi_1.4.3 rmarkdown_1.16
## [9] knitr_1.25 stringr_1.4.0 xfun_0.10 digest_0.6.22
## [13] rlang_0.4.1 evaluate_0.14
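As far as I can tell, the only notable difference is the BLAS/LAPACK library (libopenblas_haswellp-r0.3.2.so on my machine, libopenblasp-r0.2.19.so in the container).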
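To compare just the linear algebra libraries between two machines without reading the full sessionInfo() printout, something like the following sketch can be run on each. It assumes R 3.4.0 or later, where sessionInfo() carries BLAS and LAPACK entries. The names blas_path and lapack_path are only illustrative.

```r
# Report the linear-algebra libraries this R session is linked against.
si <- sessionInfo()
blas_path <- si$BLAS      # path to the BLAS shared library, if R can determine it
lapack_path <- si$LAPACK  # path to the LAPACK shared library, if R can determine it

cat("BLAS:          ", blas_path, "\n")
cat("LAPACK:        ", lapack_path, "\n")
cat("LAPACK version:", La_version(), "\n")
```

On some platforms the paths may come back empty, in which case the full sessionInfo() output is still the fallback.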
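I recently ran into the same problem when setting up SolusOS and also posted about it here. For me, the OpenBLAS library seems to be the culprit, namely libopenblas_haswellp-r0.3.2.so. Once I switched to another library, in my case libopenblas_core2p-r0.3.2.so, I started getting correct results on my SolusOS setup.
(Actually, I just edited my own post to include this information.)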
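As a reference value that does not go through lm() at all, the pooled-variance t statistic can be computed by hand from the group means and variances. This is only a sketch with illustrative names (x1, x2, sp2, t_manual, p_manual). It assumes, as far as I know, that mean() and var() on plain vectors do not depend on the BLAS library.

```r
# Split the response by group.
x1 <- sleep$extra[sleep$group == "1"]
x2 <- sleep$extra[sleep$group == "2"]
n1 <- length(x1)
n2 <- length(x2)

# Pooled variance and the classical two-sample t statistic.
sp2 <- ((n1 - 1) * var(x1) + (n2 - 1) * var(x2)) / (n1 + n2 - 2)
t_manual <- (mean(x1) - mean(x2)) / sqrt(sp2 * (1 / n1 + 1 / n2))

# Two-sided p-value on n1 + n2 - 2 degrees of freedom.
p_manual <- 2 * pt(-abs(t_manual), df = n1 + n2 - 2)

cat("t =", round(t_manual, 4), " p =", round(p_manual, 5), "\n")
```

This should reproduce the t = -1.8608 and p = 0.07919 reported by t.test() in the question. After swapping the library, the group2 row of summary(lm(...)) should agree with these values up to sign.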