为什么小值的逆 t 分布在 Matlab 和 R 中不同?

Why do the inverse t-distributions for small values differ in Matlab and R?

我想在 Matlab 中评估小值的反学生 t-分布函数,例如 1e-18。自由度为2.

不幸的是,Matlab returns NaN:

tinv(1e-18,2)
NaN

但是,如果我使用 R 的内置函数:

qt(1e-18,2)
-707106781

结果是合理的。为什么 Matlab 不能计算这个小值的函数? Matlab 和 R 的结果与 1e-15 非常相似,但对于较小的值,差异相当大:

tinv(1e-16,2)/qt(1e-16,2) = 1.05

有谁知道 Matlab 和 R 的实现算法有什么区别,如果 R 给出正确的结果,我如何有效地计算逆 t-分布,在Matlab,对于较小的值?

看来是 R 的 qt may use a completely different algorithm than Matlab's tinv. I think that you and others should report this deficiency to The MathWorks by filing a service request。顺便说一句,在 R2014b 和 R2015a 中,-Inf 是 returned 而不是 NaN 对于第一个参数的小值(大约 eps/8 和更少),p.这样比较明智,但我觉得他们应该做得更好。

在此期间,有几种解决方法。

特例
首先,在 Student's t-distribution, there are several simple analytic solutions to the inverse CDF or quantile function 的情况下,对于 ν 的某些整数参数。对于 ν = 2:

的示例
% for v = 2
p = 1e-18;
x = (2*p-1)./sqrt(2*p.*(1-p))

其中 returns -7.071067811865475e+08。至少,Matlab 的 tinv 应该包括这些特殊情况(它们只对 ν = 1 这样做)。它也可能会提高这些特定解决方案的准确性和速度。

数值逆
tinv function is based on the betaincinv function. It appears that it may be this function that is responsible for the loss of precision for small values of the first argument, p. However, as suggested by the OP, one can use the CDF function, tcdf, and root-finding methods to evaluate the inverse CDF numerically. The tcdf function is based on betainc, which doesn't appear to be as sensitive. Using fzero:

p = 1e-18;
v = 2
x = fzero(@(x)tcdf(x,v)-p, 0)

这个returns -7.071067811865468e+08。请注意,对于接近 1.

p 值,此方法不是很稳健

符号解
对于更一般的情况,您可以利用 symbolic math and variable precision arithmetic. You can use identities in terms of Gausian hypergeometric functions, 2F1, as given here for the CDF. Thus, using solve and hypergeom:

% Supposedly valid for or x^2 < v, but appears to work for your example
p = sym('1e-18');
v = sym(2);
syms x
F = 0.5+x*gamma((v+1)/2)*hypergeom([0.5 (v+1)/2],1.5,-x^2/v)/(sqrt(sym('pi')*v)*gamma(v/2));
sol_x = solve(p==F,x);
vpa(sol_x)

可以使用tinv function is based on the betaincinv function. There is no equivalent function or even an incomplete Beta function in the Symbolic Math toolbox or MuPAD, but a similar 2F1 relation for the incomplete Beta function

p = sym('1e-18');
v = sym(2);
syms x
a = v/2;
F = 1-x^a*hypergeom([a 0.5],a+1,x)/(a*beta(a,0.5));
sol_x = solve(2*abs(p-0.5)==F,x);
sol_x = sign(p-0.5).*sqrt(v.*(1-sol_x)./sol_x);
vpa(sol_x)

两个符号方案 return 结果同意 -707106781.186547523340184 使用默认值 digits

我还没有完全验证上面的两个符号方法,所以我不能保证它们在所有情况下都是正确的。该代码还需要矢量化,并且比完全数值解决方案慢。