Defining the X values for a code

Question

My task is to create a script in MATLAB that behaves like normcdf.

       x=linspace(-5,5,1000); %values for x
       p= 1/sqrt(2*pi) * exp((-x.^2)/2); % THE PDF for the standard normal 
       t=cumtrapz(x,p); % the CDF for the standard normal distribution
       plot(x,t); %shows the graph of the CDF

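The problem is that the t values are indexed in increasing order by 1:1000 rather than by the x values from -5 to 5. How can I attach the correct x values (i.e. linspace(-5,5,1000)) to the t output, so that, for example, t(n) gives the same result as normcdf(n)?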

To clarify: I cannot simply write t(-5) and get a result, as I would with normcdf(1), because the values computed by cumtrapz are indexed by 1:1000 rather than by x from -5 to 5.

Answer 1

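Try replacing x with 0.01 in the call to cumtrapz (strictly, the spacing of this grid is 10/999, i.e. x(2)-x(1)). cumtrapz (https://www.mathworks.com/help/matlab/ref/cumtrapz.html) accepts either a vector of coordinates or a scalar spacing, which may solve your problem. Also, have you checked the original x values? Is the issue with linspace (i.e. you are not getting the right x vector) or with cumtrapz?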
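A minimal sketch of the scalar-spacing idea, assuming the grid from the question. The names dx, t_vec and t_scalar are introduced here for illustration only. It rescales the unit-spacing result by dx and compares it with the coordinate-vector call:

```matlab
x  = linspace(-5,5,1000);             % grid from the question
p  = 1/sqrt(2*pi) * exp(-x.^2/2);     % standard normal PDF
dx = x(2) - x(1);                     % actual grid spacing, 10/999

t_vec    = cumtrapz(x,p);             % integrate using the coordinate vector
t_scalar = dx * cumtrapz(p);          % integrate with unit spacing, then rescale

max_diff = max(abs(t_vec - t_scalar));
fprintf('max difference = %.2e\n', max_diff);
fprintf('number of samples = %d\n', numel(t_scalar));
```

Both forms should agree up to rounding. Either way the result is still indexed by 1:1000, so a lookup from an x value to an index is still needed to evaluate the CDF at a given point.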

Answer 2

Updated answer

OK, I have read your comments; here is how to do what you want:

x = linspace(-5,5,1000);
p = 1/sqrt(2*pi) * exp((-x.^2)/2);
cdf = cumtrapz(x,p);

q = 3; % Query point
disp(normcdf(q)) % For reference
[~,I] = min(abs(x-q)); % Find closest index
disp(cdf(I)) % Show the value

Unfortunately, there is no MATLAB syntax that does this neatly in one line, but if you move the closest-index search into a separate function, you can write:

cdf(findClosest(x,q))

function I = findClosest(x,q)
if q>max(x) || q<min(x)
    warning('q outside the range of x');
end
[~,I] = min(abs(x-q));
end

Also, if you are sure that the exact value of the query point q exists in x, you can do

cdf(x==q);

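But beware of floating-point errors. You may think a range ought to contain a certain value, while in fact it differs from it by a tiny round-off error. For example, you can see this in action here: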

x1 = linspace(0,1,1000); % Range
x2 = asin(sin(x1)); % Ought to be the same thing
plot((x1-x2)/eps); grid on; % But they differ by roughly 1 unit of machine precision
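One common way around this, sketched here under assumptions, is to compare with a tolerance instead of using ==. The grid x_grid, the tolerance tol and the other names are hypothetical choices for illustration, not part of the original answer:

```matlab
a = 0.1*3;                            % mathematically equal to 0.3
b = 0.3;
exact_match = (a == b);               % false: the two doubles differ in the last bit
tol = 1e-9;                           % illustrative tolerance, far below the grid spacing
close_match = abs(a - b) <= tol;      % true

% The same idea applied to a grid lookup
x_grid = linspace(-5,5,1001);         % hypothetical grid with spacing 0.01
q = 3;                                % query point, nominally a grid point
idx = find(abs(x_grid - q) <= tol, 1);% first index within tol of q, or empty
if isempty(idx)
    warning('q is not on the grid within the tolerance');
end
```

The tolerance should be small compared with the grid spacing, so that at most one grid point can match.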

Old answer

As far as I can tell, running your code does reproduce the result of normcdf(x) well. If you want to do exactly what normcdf does, use erfc.

close all; clear; clc;

x = linspace(-5,5,1000);
cdf = normcdf(x); % Result of normcdf for comparison

%% 1 Trapezoidal integration of normal pd
p = 1/sqrt(2*pi) * exp((-x.^2)/2);
cdf1 = cumtrapz(x,p);

%% 2 But error function IS the integral of the normal pd
cdf2 = (1+erf(x/sqrt(2)))/2;

%% 3 Or, even better, use the error function complement (works better for large negative x)
cdf3 = erfc(-x/sqrt(2))/2;

fprintf('1: Mean error = %.2d\n',mean(abs(cdf1-cdf)));
fprintf('2: Mean error = %.2d\n',mean(abs(cdf2-cdf)));
fprintf('3: Mean error = %.2d\n',mean(abs(cdf3-cdf)));
plot(x,cdf1,x,cdf2,x,cdf3,x,cdf,'k--');

This gives me

1: Mean error = 7.83e-07
2: Mean error = 1.41e-17
3: Mean error = 00 <- Because that is literally what normcdf is doing
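The remark that erfc works better for large negative x can be checked with a short sketch. The test point x_tail = -10 is an arbitrary choice for illustration, well outside the plotted range:

```matlab
x_tail = -10;                              % illustrative point far in the lower tail

cdf_erf  = (1 + erf(x_tail/sqrt(2)))/2;    % erf rounds to exactly -1 here, so this is 0
cdf_erfc = erfc(-x_tail/sqrt(2))/2;        % keeps the tiny tail probability (about 7.6e-24)

fprintf('erf  form: %g\n', cdf_erf);
fprintf('erfc form: %g\n', cdf_erfc);
```

The erf form loses the tail because 1 + erf(...) cancels catastrophically once erf is within machine precision of -1, whereas erfc computes the small quantity directly.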

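If your goal is to avoid predefined MATLAB functions and instead compute the result numerically (i.e. compute the error function yourself), then it is an interesting challenge, which you can read about here or in this stats stackexchange post. As an example, the following piece of code computes the error function by implementing eq. 2 from the first link: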

nerf = @(x,n) (-1)^n*2/sqrt(pi)*x.^(2*n+1)./factorial(n)/(2*n+1);

figure(1); hold on;
temp = zeros(size(x)); p =[];
for n = 0:20
    temp = temp + nerf(x/sqrt(2),n);
    if~mod(n,3)
        p(end+1) = plot(x,(1+temp)/2);
    end
end
ylim([-1,2]);
title('\Sigma_{n=0}^{inf}  ( 2/sqrt(pi) ) \times ( (-1)^n x^{2*n+1} ) \div ( n! (2*n+1) )');
p(end+1) = plot(x,cdf,'k--');
legend(p,'n = 0','\Sigma_{n} 0->3','\Sigma_{n} 0->6','\Sigma_{n} 0->9',...
    '\Sigma_{n} 0->12','\Sigma_{n} 0->15','\Sigma_{n} 0->18','normcdf(x)',...
    'location','southeast');
grid on; box on;
xlabel('x'); ylabel('norm. cdf approximations');
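The same truncated series can also be written as a single expression for a scalar argument. This is only a sketch: erf_series and the truncation order N are names introduced here, and the built-in erf is used purely as a reference for the check.

```matlab
% Partial sum of the Maclaurin series of erf for scalar x, terms n = 0..N
erf_series = @(x,N) 2/sqrt(pi) * sum( ((-1).^(0:N)) .* x.^(2*(0:N)+1) ...
                                      ./ (factorial(0:N) .* (2*(0:N)+1)) );

x0 = 1;                                   % illustrative evaluation point
approx  = erf_series(x0, 20);             % 21 terms, as in the loop above
err_abs = abs(approx - erf(x0));          % compare with the built-in erf
fprintf('series error at x = %g: %.2e\n', x0, err_abs);
```

The series converges quickly for small |x|, but it needs many more terms, and suffers from cancellation, as |x| grows, which is why the partial sums in the plot wander far from the CDF near the ends of the range.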

Answer 3

Another answer suggests a way to find the nearest sample point. It is easier, IMO, to interpolate. Given x and t as defined in the question,

interp1(x,t,n)

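returns an estimate of the CDF at x==n, for any value of n inside the computed range. Note that with the default linear method, interp1 returns NaN for values outside that range. It extrapolates only if you ask for it (e.g. with the 'extrap' option), and the extrapolated values are unreliable.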

You can define an anonymous function that works like normcdf:

my_normcdf = @(n)interp1(x,t,n);

my_normcdf(-5)
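A quick sanity check of this approach, sketched without the Statistics Toolbox: the erfc-based closed form serves as the reference, and ref, q, err and outside are names introduced here for illustration.

```matlab
x = linspace(-5,5,1000);                  % grid from the question
p = 1/sqrt(2*pi) * exp(-x.^2/2);          % standard normal PDF
t = cumtrapz(x,p);                        % numerical CDF on the grid

my_normcdf = @(n) interp1(x,t,n);         % interpolating lookup
ref        = @(n) erfc(-n/sqrt(2))/2;     % closed-form reference

q   = [-1.234 0 2.5];                     % arbitrary query points inside the range
err = abs(my_normcdf(q) - ref(q));        % interpolation plus integration error
fprintf('max error at query points: %.2e\n', max(err));

outside = my_normcdf(6);                  % outside [-5,5]: NaN with the default method
```

The query points do not need to coincide with grid points, which is the main advantage over a nearest-index lookup.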

statistics

matlab

probability

cdf
