Fitting an empirical CDF curve to find an exact value
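I am trying to use an empirical CDF to find the exact value for any number. What is the best way to get the exact value? Can I use a fitting tool and then use the fitted function to estimate it?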
[f,x] = ecdf(samples);
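That is, how do I find the function that best fits my empirical CDF, so that I can get the exact CDF for any number I want?
These are my samples:
You can get an approximation of f(x) by finding the shape (σ) and location (μ) parameters that best fit the curve in the least-squares sense.
Here is an example set of noisy test data with a normal distribution (similar to your sampled data):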
>> % ytest = f(xtest, mutest, sigtest) % sample test data
>> xtest = linspace(-10, 10, 100); % independent variable linearly spaced
>> mutest = rand(1, 1) - 0.5; % random location parameter
>> sigtest = 1 + rand(1, 1); % random shape parameter
>> ytest = normcdf(xtest, mutest, sigtest) + rand(1, 100) / 10; % distribution
mutest =
0.2803
sigtest =
1.6518
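Now you can use fminsearch to find the shape and location parameters, assuming a normal distribution. We need to provide an objective function for fminsearch to minimize, so we create an anonymous function that returns the norm of the residuals between the ideal normal cumulative distribution function and the test data. The function takes two parameters, [μ, σ], which we pass as a vector. We also need to give fminsearch an initial guess.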
>> % objective function with normal distribution
>> % mu(1) = location parameter (mean)
>> % mu(2) = shape parameter (standard deviation)
>> obj_func = @(mu)norm(normcdf(xtest, mu(1), mu(2)) - ytest)
>> mu0 = [0, 1]; % initial guesses for mean and stdev
>> mu = fminsearch(obj_func, mu0);
>> sigma = mu(2); % best fit standard deviation
>> mu = mu(1) % best fit mean
mu =
-0.0386
sigma =
1.7399
Now you can use the normcdf function with x, μ, and σ to predict the CDF at any point within your empirical data:
>> y = normcdf(xtest, mu, sigma);
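To apply the same idea to the [f,x] output of ecdf from the question, here is a minimal sketch. The samples vector is a hypothetical stand-in drawn with randn, since the original samples are not shown. The starting guess taken from mean and std is my own choice rather than the [0, 1] used above, and xq is an arbitrary set of query points. It assumes the Statistics and Machine Learning Toolbox is available for ecdf and normcdf.

```matlab
% Hypothetical stand-in for the asker's data; replace with the real samples.
rng(0);                                  % fixed seed so the run is repeatable
samples = 0.3 + 1.6 * randn(500, 1);

[f, x] = ecdf(samples);                  % empirical CDF values f at points x

% Least-squares fit of a normal CDF to the empirical CDF.
% p(1) = mean, p(2) = standard deviation
obj_func = @(p) norm(normcdf(x, p(1), p(2)) - f);
p = fminsearch(obj_func, [mean(samples), std(samples)]);
mu = p(1);
sigma = p(2);

% Smooth CDF estimate at arbitrary query points, including points
% that fall between the observed samples.
xq = [-1.5, 0, 2.75];
yq = normcdf(xq, mu, sigma);

% Largest gap between the fitted curve and the empirical CDF.
max_gap = max(abs(normcdf(x, mu, sigma) - f));
```

If max_gap comes out large, the normal assumption is probably a poor match for the data and another distribution family is worth trying.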
MATLAB offers many types of probability distributions. If you don't know what type of distribution your data has, and your population has only positive values, then one possible distribution is the Weibull, which has a flexible three-parameter form: shape, scale, and location. See "Estimate parameters of 3-parameter Weibull" on MATLAB. Then just replace normcdf with wblcdf.
>> xtest = linspace(0, 10, 100);
>> mutest = rand(1, 1) - 0.5; % location
>> mutest
mutest = -0.35813
>> sigtest = 1 + rand(1, 2); % scale and shape, in wblcdf argument order
>> sigtest
sigtest =
1.6441 1.3324
>> ytest = wblcdf(xtest-mutest, sigtest(1), sigtest(2)) + rand(1, 100) / 10;
>> % objective function with Weibull distribution
>> % mu(1) = location parameter
>> % mu(2) = scale parameter
>> % mu(3) = shape parameter
>> obj_func = @(mu)norm(wblcdf(xtest-mu(1), mu(2), mu(3)) - ytest)
>> mu0 = [0, 1, 1]; % initial guesses for location, scale, and shape
>> mu = fminsearch(obj_func, mu0);
>> mu
mu =
-0.85695 1.94229 1.89319
>> shape = mu(3); % best fit shape
>> sigma = mu(2); % best fit scale
>> mu = mu(1) % best fit location
>> y = wblcdf(xtest-mu, sigma, shape);
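One caveat with the Weibull fit: fminsearch is unconstrained, while wblcdf expects a positive scale and shape. A common workaround, sketched below as an assumption on my part rather than part of the original answer, is to optimize the logarithms of those two parameters so that they stay positive for any search step. The names q, loc, scale, and xq are made up for this illustration, and the test data uses fixed parameter values instead of random ones.

```matlab
rng(0);                                  % fixed seed so the run is repeatable
xtest = linspace(0, 10, 100);

% Noisy test data: location -0.35, scale 1.6, shape 1.3 (illustrative values).
ytest = wblcdf(xtest + 0.35, 1.6, 1.3) + rand(1, 100) / 10;

% q(1) = location, q(2) = log(scale), q(3) = log(shape)
% exp() maps any real q(2), q(3) back to a strictly positive scale and shape.
obj_func = @(q) norm(wblcdf(xtest - q(1), exp(q(2)), exp(q(3))) - ytest);
q = fminsearch(obj_func, [0, 0, 0]);     % same start as [0, 1, 1] in natural units

loc   = q(1);
scale = exp(q(2));
shape = exp(q(3));

% Fitted CDF at a single arbitrary query point.
xq = 2.5;
yq = wblcdf(xq - loc, scale, shape);
```

The fitted curve is then used the same way as before, with wblcdf(x - loc, scale, shape) giving the estimated CDF at any x.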