Comparing generated data to measured data

Question

We measured some data and managed to determine the type of distribution it follows (Gamma) and its parameters (A, B).

We then used a for loop to generate n = 10000 samples from the same distribution, with the same parameters and restricted to the same range (18.5 to 59):

for i=1:1:10000
tot=makedist('Gamma','A',11.8919,'B',2.9927);
tot= truncate(tot,18.5,59);
W(i,:) =random(tot,1,1);
end

Then we tried to fit the generated data using:

h1=histfit(W);

After that, we tried to plot the Gamma curve so that both curves could be compared on the same figure:

hold on
h2=histfit(W,[],'Gamma');
h2(1).Visible='off';

The problem is that the two curves are shifted relative to each other, as shown in the figures below (Figure 1 is the generated data from the previous code; Figure 2 is the generated data without truncation).

[image: Figure 1 and Figure 2]

Does anyone know why?

Thanks in advance.

Answer 1

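By default, histfit fits a normal probability density function (PDF) to the histogram. I'm not sure what you are actually trying to do, but what you did is: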

% fit a normal PDF
h1=histfit(W); % this is equal to h1 = histfit(W,[],'normal');

% fit a gamma PDF
h2=histfit(W,[],'Gamma');

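Obviously this leads to different fits, because a normal PDF != a gamma PDF. All you are seeing is that the gamma PDF fits the histogram better, because you sampled the data from that distribution.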
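To see the difference more directly, the sample can be plotted against the distribution it was actually drawn from. The following is only a sketch: it assumes the Statistics and Machine Learning Toolbox, it regenerates W in vectorized form, and it uses fitdist as a stand-in for the fits that histfit draws (the variable names pdN and pdG are made up for this illustration).

```matlab
% Sketch: compare the generated sample with the truncated gamma it came from.
rng(0);                                    % fixed seed, for reproducibility only
tot = makedist('Gamma','A',11.8919,'B',2.9927);
tot = truncate(tot,18.5,59);
W   = random(tot,10000,1);

% Histogram scaled to unit area, so it is comparable to a PDF
histogram(W,'Normalization','pdf');
hold on

x = linspace(18.5,59,500);
plot(x,pdf(tot,x),'LineWidth',2);          % PDF of the truncated gamma

pdN = fitdist(W,'Normal');                 % assumed equivalent of histfit(W)
pdG = fitdist(W,'Gamma');                  % assumed equivalent of histfit(W,[],'Gamma')
plot(x,pdf(pdN,x),'--');
plot(x,pdf(pdG,x),':');
legend('generated data','truncated gamma','normal fit','gamma fit');
hold off
```

Note that pdG is an untruncated gamma fitted to truncated data, so it does not have to coincide exactly with the curve for tot.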

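If you want to check whether the data follow a specific distribution, you can also use a KS test (Kolmogorov-Smirnov test). In your case: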

% check if the data follows the distribution specified in tot
[h, p] = kstest(W,'CDF',tot)

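If the data follow the distribution in tot (the truncated gamma), the test does not reject the null hypothesis at the default 5% significance level, so h = 0 and p > 0.05. Otherwise h = 1 and p < 0.05.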
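As an illustration, the test can be run once against the truncated gamma and once against a normal distribution fitted to the same sample. This is a sketch under assumptions: W is regenerated here in vectorized form, and pdN, hN and pN are names invented for the example. Strictly speaking, a KS test against a distribution whose parameters were estimated from the same data is only approximate, so the second result should be read as indicative.

```matlab
% Sketch: KS test of the sample against two candidate distributions.
rng(0);                                    % fixed seed, for reproducibility only
tot = truncate(makedist('Gamma','A',11.8919,'B',2.9927),18.5,59);
W   = random(tot,10000,1);

% Null hypothesis: W comes from the truncated gamma distribution in tot
[h, p] = kstest(W,'CDF',tot);

% For contrast: null hypothesis that W comes from a normal fitted to W
pdN = fitdist(W,'Normal');
[hN, pN] = kstest(W,'CDF',pdN);

fprintf('truncated gamma: h = %d, p = %.3g\n', h, p);
fprintf('fitted normal:   h = %d, p = %.3g\n', hN, pN);
```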

Now some general remarks on your code: please look up preallocation of memory; it will speed up loops considerably. For example:

W = zeros(10000,1);
for i=1:1:10000
    tot=makedist('Gamma','A',11.8919,'B',2.9927);
    tot= truncate(tot,18.5,59);
    W(i,:) =random(tot,1,1);
end

Furthermore,

tot=makedist('Gamma','A',11.8919,'B',2.9927);
tot= truncate(tot,18.5,59);

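does not depend on the loop index, so it can be moved in front of the loop to speed things up further. It is also good practice to avoid using i as a loop variable, because in MATLAB i also denotes the imaginary unit.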
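Putting these remarks together, the loop could look roughly like this. It is a sketch, not the only way to write it; the names n and k are chosen here for illustration.

```matlab
% Sketch: preallocate, build the distribution once, and avoid i as the index.
n = 10000;
W = zeros(n,1);                                   % preallocate the result

tot = makedist('Gamma','A',11.8919,'B',2.9927);   % created once, outside the loop
tot = truncate(tot,18.5,59);

for k = 1:n
    W(k) = random(tot);                           % one sample per iteration
end
```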

But actually you can skip the whole loop, because random() can return multiple samples at once:

tot=makedist('Gamma','A',11.8919,'B',2.9927);
tot= truncate(tot,18.5,59);
W =random(tot,10000,1);

statistics

matlab

distribution