Implementation of EM algorithm for Gaussian Mixture Models
I want to train a Gaussian Mixture Model with four components on a given dataset using the EM algorithm. The set is three-dimensional and contains 300 samples.

The problem is that after about 6 rounds of the EM algorithm, the covariance matrices sigma become close to singular according to MATLAB (rank(sigma) = 2 instead of 3). This in turn leads to undesired results like complex values when evaluating the Gaussian distribution gm(k,i).

Furthermore, I used the log of the Gaussian to fight underflow troubles - see the E-step. I am not sure if this is correct, and whether I have to take the exp of the responsibilities p(w_k | x^(i), theta) somewhere else?

Can you tell me if my implementation of the EM algorithm is correct so far? And how to account for the problem with the near-singular covariance sigma?
Below is my implementation of the EM algorithm.

First I initialize the means and covariances of the components using kmeans:
load('data1.mat');
X = Data'; % 300x3 data set
D = size(X,2); % dimension
N = size(X,1); % number of samples
K = 4; % number of Gaussian Mixture components
% Initialization
p = [0.2, 0.3, 0.2, 0.3]; % arbitrary pi
[idx,mu] = kmeans(X,K); % initial means of the components
% compute the covariance of the components
sigma = zeros(D,D,K);
for k = 1:K
    sigma(:,:,k) = cov(X(idx==k,:));
end
For the E-step, I use the following formula to calculate the responsibilities:

p(w_k | x^(i), theta) = p(k) * N(x^(i) | mu_k, sigma_k) / sum_{j=1..K} p(j) * N(x^(i) | mu_j, sigma_j)

w_k is the k-th Gaussian component. x^(i) is a single datapoint (sample). theta stands for the parameters of the Gaussian Mixture Model: mu, sigma, pi.

Here is the corresponding code:
% variables for convergence
converged = 0;
prevLoglikelihood = Inf;
prevMu = mu;
prevSigma = sigma;
prevPi = p;
round = 0;
while (converged ~= 1)
    round = round + 1
    gm = zeros(K,N);    % gaussian component in the numerator
    sumGM = zeros(N,1); % denominator of responsibilities
    % E-step: Evaluate the responsibilities using the current parameters
    % compute the numerator and denominator of the responsibilities
    for k = 1:K
        for i = 1:N
            Xmu = X-mu;
            % I am using log to prevent underflow of the gaussian distribution (exp("small value"))
            logPdf = log(1/sqrt(det(sigma(:,:,k))*(2*pi)^D)) + (-0.5*Xmu*(sigma(:,:,k)\Xmu'));
            gm(k,i) = log(p(k)) * logPdf;
            sumGM(i) = sumGM(i) + gm(k,i);
        end
    end
    % calculate responsibilities
    res = zeros(K,N); % responsibilities
    Nk = zeros(4,1);
    for k = 1:K
        for i = 1:N
            % I tried to use exp(gm(k,i)/sumGM(i)) to compute res but this leads to sum(pi) > 1.
            res(k,i) = gm(k,i)/sumGM(i);
        end
        Nk(k) = sum(res(k,:));
    end
Nk(k) is computed using the formula given in the M-step, and it is used in the M-step to calculate the new probabilities p(k).

M-step:
    % M-step: Re-estimate the parameters using the current responsibilities
    for k = 1:K
        for i = 1:N
            mu(k,:) = mu(k,:) + res(k,i).*X(k,:);
            sigma(:,:,k) = sigma(:,:,k) + res(k,i).*(X(k,:)-mu(k,:))*(X(k,:)-mu(k,:))';
        end
        mu(k,:) = mu(k,:)./Nk(k);
        sigma(:,:,k) = sigma(:,:,k)./Nk(k);
        p(k) = Nk(k)/N;
    end
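Independent of the questions above, one thing I am considering against the near-singular sigma is adding a small diagonal ridge right after the M-step update. This is just a sketch of the idea; the value 1e-6 is an arbitrary choice of mine that I have not validated on this data:

    % hypothetical regularization: keep sigma invertible with a small ridge
    for k = 1:K
        sigma(:,:,k) = sigma(:,:,k) + 1e-6*eye(D);
    end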
Now, to check for convergence, the log-likelihood is computed with the following formula:

loglikelihood = sum_{i=1..N} log( sum_{k=1..K} p(k) * N(x^(i) | mu_k, sigma_k) )
    % Evaluate the log-likelihood and check for convergence of either
    % the parameters or the log-likelihood. If not converged, go to E-step.
    loglikelihood = 0;
    for i = 1:N
        loglikelihood = loglikelihood + log(sum(gm(:,i)));
    end
    % Check for convergence of parameters
    errorLoglikelihood = abs(loglikelihood-prevLoglikelihood);
    if (errorLoglikelihood <= eps)
        converged = 1;
    end
    errorMu = abs(mu(:)-prevMu(:));
    errorSigma = abs(sigma(:)-prevSigma(:));
    errorPi = abs(p(:)-prevPi(:));
    if (all(errorMu <= eps) && all(errorSigma <= eps) && all(errorPi <= eps))
        converged = 1;
    end
    prevLoglikelihood = loglikelihood;
    prevMu = mu;
    prevSigma = sigma;
    prevPi = p;
end % while
Is there something wrong with my MATLAB implementation of the EM algorithm for Gaussian Mixture Models?
Previous problems:

The problem was that I couldn't check for convergence using the log-likelihood, because it was -Inf. This resulted from rounded zero values when evaluating the Gaussian in the formula of the responsibilities (see E-step).

Can you tell me if my implementation of the EM algorithm is correct so far? And how to account for the problem with the rounded zero values?

Below is that implementation of the EM algorithm.

First I initialize the means and covariances of the components using kmeans:
load('data1.mat');
X = Data'; % 300x3 data set
D = size(X,2); % dimension
N = size(X,1); % number of samples
K = 4; % number of Gaussian Mixture components
% Initialization
p = [0.2, 0.3, 0.2, 0.3]; % arbitrary pi
[idx,mu] = kmeans(X,K); % initial means of the components
% compute the covariance of the components
sigma = zeros(D,D,K);
for k = 1:K
    sigma(:,:,k) = cov(X(idx==k,:));
end
For the E-step I use the same formula as above to calculate the responsibilities. Here is the corresponding code:
% variables for convergence
converged = 0;
prevLoglikelihood = Inf;
prevMu = mu;
prevSigma = sigma;
prevPi = p;
round = 0;
while (converged ~= 1)
    round = round + 1
    gm = zeros(K,N);    % gaussian component in the numerator -
                        % some values evaluate to zero
    sumGM = zeros(N,1); % denominator of responsibilities
    % E-step: Evaluate the responsibilities using the current parameters
    % compute the numerator and denominator of the responsibilities
    for k = 1:K
        for i = 1:N
            % HERE some values evaluate to zero, e.g. exp(-746.6228) = 0
            gm(k,i) = p(k)/sqrt(det(sigma(:,:,k))*(2*pi)^D)*exp(-0.5*(X(i,:)-mu(k,:))*inv(sigma(:,:,k))*(X(i,:)-mu(k,:))');
            sumGM(i) = sumGM(i) + gm(k,i);
        end
    end
    % calculate responsibilities
    res = zeros(K,N); % responsibilities
    Nk = zeros(4,1);
    for k = 1:K
        for i = 1:N
            res(k,i) = gm(k,i)/sumGM(i);
        end
        Nk(k) = sum(res(k,:));
    end
Nk(k) is computed using the formula given in the M-step.

M-step:
    % M-step: Re-estimate the parameters using the current responsibilities
    mu = zeros(K,3);
    for k = 1:K
        for i = 1:N
            mu(k,:) = mu(k,:) + res(k,i).*X(k,:);
            sigma(:,:,k) = sigma(:,:,k) + res(k,i).*(X(k,:)-mu(k,:))*(X(k,:)-mu(k,:))';
        end
        mu(k,:) = mu(k,:)./Nk(k);
        sigma(:,:,k) = sigma(:,:,k)./Nk(k);
        p(k) = Nk(k)/N;
    end
Now, to check for convergence, the log-likelihood is computed with the same formula as above:
    % Evaluate the log-likelihood and check for convergence of either
    % the parameters or the log-likelihood. If not converged, go to E-step.
    loglikelihood = 0;
    for i = 1:N
        loglikelihood = loglikelihood + log(sum(gm(:,i)));
    end
    % Check for convergence of parameters
    errorLoglikelihood = abs(loglikelihood-prevLoglikelihood);
    if (errorLoglikelihood <= eps)
        converged = 1;
    end
    errorMu = abs(mu(:)-prevMu(:));
    errorSigma = abs(sigma(:)-prevSigma(:));
    errorPi = abs(p(:)-prevPi(:));
    if (all(errorMu <= eps) && all(errorSigma <= eps) && all(errorPi <= eps))
        converged = 1;
    end
    prevLoglikelihood = loglikelihood;
    prevMu = mu;
    prevSigma = sigma;
    prevPi = p;
end % while
After the first round the loglikelihood is around 700. In the second round it is -Inf because some gm(k,i) values in the E-step are zero, so the log is obviously negative infinity.

The zero values also lead to sumGM being equal to zero, and therefore to all entries of the mu and sigma matrices becoming NaN.

How can I solve this problem? Can you tell me if there is something wrong with my implementation? Could it be resolved by somehow raising MATLAB's precision?
EDIT:

I added scaling for the exp() term in gm(k,i). Unfortunately this doesn't help much. After a few more rounds I still get the underflow problem.
scale = zeros(N,D);
for i = 1:N
    max = 0;
    for k = 1:K
        Xmu = X(i,:)-mu(k,:);
        if (norm(scale(i,:) - Xmu) > max)
            max = norm(scale(i,:) - Xmu);
            scale(i,:) = Xmu;
        end
    end
end

for k = 1:K
    for i = 1:N
        Xmu = X(i,:)-mu(k,:);
        % scale gm to prevent underflow
        Xmu = Xmu - scale(i,:);
        gm(k,i) = p(k)/sqrt(det(sigma(:,:,k))*(2*pi)^D)*exp(-0.5*Xmu*inv(sigma(:,:,k))*Xmu');
        sumGM(i) = sumGM(i) + gm(k,i);
    end
end
Furthermore I noticed that the means initialized by kmeans are completely different from those of the following rounds, where the means are computed in the M-step.

Means from kmeans:
mu = 13.500000000000000 0.026602138870044 0.062415945993735
88.500000000000000 -0.009869960132085 -0.075177888210981
39.000000000000000 -0.042569305020309 0.043402772876513
64.000000000000000 -0.024519281362918 -0.012586980924762
After the M-step:
round = 2
mu = 1.000000000000000 0.077230046948357 0.024498886414254
2.000000000000000 0.074260118474053 0.026484346404660
3.000000000000002 0.070944016105476 0.029043085983168
4.000000000000000 0.067613431480832 0.031641849205021
In the following rounds mu doesn't change at all; it stays the same as in round 2.

I guess this is caused by the underflow in gm(k,i)? Either my implementation of the scaling is incorrect, or the whole implementation of the algorithm is wrong somewhere :(
EDIT 2:

After four rounds I got NaN values and looked into gm in more detail. Looking at only one sample (and without the 0.5 factor), gm becomes zero in all components. Put into MATLAB: gm(:,1) = [0 0 0 0]. This in turn leads to sumGM being equal to zero -> NaN, because I divided by zero. More details below:
round = 1
mu = 62.0000 -0.0298 -0.0078
37.0000 -0.0396 0.0481
87.5000 -0.0083 -0.0728
12.5000 0.0303 0.0614
gm(:,1) = [11.7488, 0.0000, 0.0000, 0.0000]
round = 2
mu = 1.0000 0.0772 0.0245
2.0000 0.0743 0.0265
3.0000 0.0709 0.0290
4.0000 0.0676 0.0316
gm(:,1) = [0.0000, 0.0000, 0.0000, 0.3128]
round = 3
mu = 1.0000 0.0772 0.0245
2.0000 0.0743 0.0265
3.0000 0.0709 0.0290
4.0000 0.0676 0.0316
gm(:,1) = [0, 0, 0.0000, 0.2867]
round = 4
mu = 1.0000 0.0772 0.0245
NaN NaN NaN
3.0000 0.0709 0.0290
4.0000 0.0676 0.0316
gm(:,1) = 1.0e-105 * [0, NaN, 0, 0.5375]
First, the means seem to stay unchanged, and they are completely different from the initialization by kmeans.

And judging from the output of gm(:,1), every sample (not just the first one as shown here) corresponds to only one Gaussian component. Shouldn't the samples be "partially distributed" among every Gaussian component?
EDIT 3:

So I guess the problem with mu not changing was the first line in the M-step: mu = zeros(K,3);

To fight the underflow problem I am currently trying to use the log of the Gaussian:
function logPdf = logmvnpdf(X, mu, sigma, D)
    Xmu = X-mu;
    logPdf = log(1/sqrt(det(sigma)*(2*pi)^D)) + (-0.5*Xmu*inv(sigma)*Xmu');
end
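As a possibly more stable variant I am also considering evaluating the log-density via a Cholesky factorization instead of det() and inv(). This is only a sketch (the name logmvnpdfChol is mine); it assumes sigma is positive definite, and chol() erroring out would at least flag the degenerate case explicitly:

function logPdf = logmvnpdfChol(X, mu, sigma, D)
    Xmu = X - mu;
    L = chol(sigma, 'lower');      % sigma = L*L', fails if not positive definite
    v = L \ Xmu';                  % so that v'*v = Xmu*inv(sigma)*Xmu'
    logDet = 2*sum(log(diag(L)));  % log(det(sigma)) without computing det()
    logPdf = -0.5*(D*log(2*pi) + logDet + v'*v);
end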
The new problem is the covariance matrix sigma. MATLAB claims:

Warning: Matrix is close to singular or badly scaled. Results may be inaccurate.

After 6 rounds I get imaginary values for gm (the Gaussian distribution).

The updated E-step now looks like this:
gm = zeros(K,N);    % gaussian component in the numerator
sumGM = zeros(N,1); % denominator of responsibilities

for k = 1:K
    for i = 1:N
        %gm(k,i) = p(k)/sqrt(det(sigma(:,:,k))*(2*pi)^D)*exp(-0.5*Xmu*inv(sigma(:,:,k))*Xmu');
        %gm(k,i) = p(k)*mvnpdf(X(i,:),mu(k,:),sigma(:,:,k));
        gm(k,i) = log(p(k)) + logmvnpdf(X(i,:), mu(k,:), sigma(:,:,k), D);
        sumGM(i) = sumGM(i) + gm(k,i);
    end
end
Answer:

It looks like you should be able to use a scale factor scale(i) to bring gm(k,i) into a representable range, because if you multiply gm(k,i) by scale(i), this will end up multiplying sumGM(i) as well and be cancelled away when you work out res(k,i) = gm(k,i) / sumGM(i).

In theory I would make scale(i) = 1 / max_k(exp(-0.5*(X(i,:)-mu(k,:)))), and in practice compute it without doing the exponentiation, so that you end up working with its log, max_k(-0.5*(X(i,:)-mu(k,:))). This gives you a common term you can add to each -0.5*(X(i,:)-mu(k,:)) before calling exp(), which keeps at least the maximum within representable range; anything that still underflows to zero after this correction you don't care about, because it is vanishingly small compared to the other contributions.
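In MATLAB, a minimal sketch of this idea could look like the following (assuming logGm(k,i) holds log(p(k)) plus the log-density, e.g. via the logmvnpdf helper from EDIT 3; this is the standard log-sum-exp trick, not code taken from the question itself):

logGm = zeros(K,N);
for k = 1:K
    for i = 1:N
        logGm(k,i) = log(p(k)) + logmvnpdf(X(i,:), mu(k,:), sigma(:,:,k), D);
    end
end

res = zeros(K,N);
loglikelihood = 0;
for i = 1:N
    m = max(logGm(:,i));         % the common term: largest log-contribution
    g = exp(logGm(:,i) - m);     % the maximum becomes exp(0) = 1, so it cannot underflow
    res(:,i) = g / sum(g);       % the scale factor exp(-m) cancels in the ratio
    loglikelihood = loglikelihood + m + log(sum(g)); % stable log-sum-exp
end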