Getting high precision values from qnorm in the tail

Question

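Problem

I am looking for high-precision values of the normal distribution in the tails (at 1e-10 and 1 - 1e-10), because the R package I am using sets any number outside this range to these values and then calls the qnorm and qt functions.

I have noticed that the qnorm implementation in R is not symmetric in the tails. This surprised me, since the distribution is well known to be symmetric and I have seen implementations in other languages that are symmetric. I checked the qt function, and it is not symmetric in the tails either.

Here are the results from the qnorm function: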

x       qnorm(x)                qnorm(1-x)              qnorm(1-x) + qnorm(x)
1e-2    -2.3263478740408408     2.3263478740408408      0.0 (i.e. < machine epsilon)
1e-3    -3.0902323061678132     3.0902323061678132      0.0 (i.e. < machine epsilon)
1e-4    -3.71901648545568       3.7190164854557084      2.8421709430404007e-14
1e-5    -4.2648907939228256     4.2648907939238399      1.014299755297543e-12
1e-10   -6.3613409024040557     6.3613408896974208      -1.2706634855419452e-08

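Clearly this function loses accuracy for values of x close to 0 or 1. Yes, in "normal" use this is not a problem, but I am looking at edge cases and multiplying small probabilities by very large values, in which case the error (1e-08) becomes a large value.

Note: I have tried this both with 1-x and by entering the actual numbers 0.00001 and 0.99999, and the accuracy problem is still there.

Questions

First, is this a known problem with the qnorm and qt implementations? I could not find anything about it in the documentation, and the algorithm is supposed to be accurate to 16 digits for p values down to 10^-314, as described in the Algorithm AS 241 paper.

Quote from the R documentation: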

Wichura, M. J. (1988) Algorithm AS 241: The percentage points of the normal distribution. Applied Statistics, 37, 477–484.

which provides precise results up to about 16 digits.

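If the R code implements the 7-digit version, why does it claim 16 digits? Or is it "accurate", but the original algorithm is not symmetric and is wrong?

If R does implement both versions of Algorithm AS 241, can I turn on the 16-digit version?

Or is there a more accurate version of qnorm in R? Or is there another solution to my problem, which is that I need high precision in the tails of the quantile functions?

R version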

>version 
platform       x86_64-w64-mingw32          
arch           x86_64                      
os             mingw32                     
system         x86_64, mingw32             
status                                     
major          3                           
minor          3.2                         
year           2016                        
month          10                          
day            31                          
svn rev        71607                       
language       R                           
version.string R version 3.3.2 (2016-10-31)
nickname       Sincere Pumpkin Patch

Answer 1

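It turns out (as Spencer Graves pointed out in his response to this same question on the R-devel list-serve) that qnorm() does in fact perform as advertised. It's just that, to get highly accurate results in the upper tail of the distribution, you need to take advantage of the function's lower.tail argument.

Here's how to do that: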

options(digits=22)

## For values of p in [0, 0.5], specify lower tail probabilities 
qnorm(p = 1e-10)                      ## x: P(X <= x) == 1e-10
# [1] -6.3613409024040557

## For values of p in (0.5, 1], specify upper tail probabilities
qnorm(p = 1e-10, lower.tail=FALSE)    ## x: P(X > x)  == 1e-10     (correct approach)
# [1] 6.3613409024040557
qnorm(p = 1 - 1e-10)                  ## x: P(X <= x) == 1-(1e-10) (incorrect approach)
# [1] 6.3613408896974208
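
The question also mentions qt. Its signature has the same lower.tail argument, so the same approach should carry over. The following is only a minimal sketch: df = 5 is an arbitrary value chosen for illustration, and the variable names are not from the original post.

```r
# Arbitrary degrees of freedom, chosen only for illustration
df <- 5
p  <- 1e-10

lower <- qt(p, df = df)                       # x: P(T <= x) == 1e-10
upper <- qt(p, df = df, lower.tail = FALSE)   # x: P(T >  x) == 1e-10
naive <- qt(1 - p, df = df)                   # affected by rounding in 1 - p

# Relative asymmetry of each approach (smaller is better)
asym_tail  <- abs(lower + upper) / abs(lower)
asym_naive <- abs(lower + naive) / abs(lower)
c(lower.tail = asym_tail, naive = asym_naive)
```

If the two tails are handled symmetrically, the first number should be at or near zero while the second should be noticeably larger.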

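The problem is that 1-1e-10 (for example) is subject to floating-point rounding error, so that it is not really the same distance from 1 (the upper end of the interval) as 1e-10 is from 0 (the lower end of the interval). The underlying problem (it's R-FAQ 7.31!) becomes obvious when put in this more familiar guise: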

1 - (1 - 1e-10) == 1e-10
## [1] FALSE
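
To see that this rounding is enough to explain the size of the discrepancy, one can propagate it through the quantile function. The sketch below relies on the standard fact that the derivative of a quantile function is one over the density at the quantile, so an error dp in an upper-tail probability shifts the quantile by roughly -dp / dnorm(z). The variable names are illustrative only.

```r
p       <- 1e-10
p_upper <- 1 - p                      # 1 - p rounded to a representable double

# Rounding error carried by p_upper, expressed as an upper-tail probability
delta <- (1 - p_upper) - p

z         <- qnorm(p, lower.tail = FALSE)   # accurate upper-tail quantile
observed  <- qnorm(p_upper) - z             # error of the naive call
predicted <- -delta / dnorm(z)              # first-order estimate: dq = -dp / density

c(delta = delta, observed = observed, predicted = predicted)
```

The observed and predicted values should agree closely with each other and with the roughly -1.27e-08 asymmetry in the question's table. In other words, the error comes from the input 1 - 1e-10 and not from qnorm itself.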

Finally, here is a quick confirmation that qnorm() does give accurate (or at least symmetric) results out to the values claimed in its help file:

qnorm(1e-314)
## [1] -37.906647423565666
qnorm(1e-314, lower.tail=FALSE)
## [1] 37.906647423565666

## With this failing in just the way (and for just the reason) you'd now expect
qnorm(1-1e-314)
# [1] Inf
1 == (1-1e-314)
# [1] TRUE
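
A related option, not part of the original answer, is the log.p argument that qnorm() also accepts. If the tail probability is available on the log scale, passing it with log.p = TRUE avoids forming 1 - p altogether. The sketch below is offered under that assumption; the value -800 is an arbitrary illustration of a log-probability whose exponential underflows to zero.

```r
p <- 1e-10

# Upper-tail quantile from a log-scale tail probability
q_log  <- qnorm(log(p), lower.tail = FALSE, log.p = TRUE)
q_tail <- qnorm(p, lower.tail = FALSE)
abs(q_log - q_tail)                   # should be tiny

# A log-probability too small to represent directly
exp(-800) == 0                        # the probability itself underflows
q_tiny <- qnorm(-800, log.p = TRUE)   # lower-tail quantile, still computed
q_tiny
```

How accurate the result is this far out may depend on the R version, so treat the last value as indicative only.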
