How to plot decision boundary from linear SVM after PCA in Matlab?
I have trained a linear SVM on a large dataset, but to reduce the number of dimensions I first performed PCA and ran the SVM on a subset of the component scores (the first 650 components explain 99.5% of the variance). Now I want to plot the decision boundary in the original variable space using the beta weights and bias of the SVM that was fit in PCA space. However, I can't work out how to project the SVM's bias term back into the original variable space. I've written a demo using the Fisher iris data to illustrate:
clear; clc; close all
% load data
load fisheriris
inds = ~strcmp(species,'setosa');
X = meas(inds,3:4);
Y = species(inds);
mu = mean(X); % feature means, needed later to undo the PCA centering
% perform the PCA
[eigenvectors, scores] = pca(X);
% train the svm
SVMModel = fitcsvm(scores,Y);
% plot the result
figure(1)
gscatter(scores(:,1),scores(:,2),Y,'rgb','osd')
title('PCA space')
% now plot the decision boundary
betas = SVMModel.Beta;
m = -betas(1)/betas(2); % my gradient
b = -SVMModel.Bias; % my y-intercept
f = @(x) m.*x + b; % my linear equation
hold on
fplot(f,'k')
hold off
axis equal
xlim([-1.5 2.5])
ylim([-2 2])
% inverse transform the PCA
Xhat = scores * eigenvectors';
Xhat = bsxfun(@plus, Xhat, mu);
% plot the result
figure(2)
hold on
gscatter(Xhat(:,1),Xhat(:,2),Y,'rgb','osd')
% and the decision boundary
betaHat = betas' * eigenvectors';
mHat = -betaHat(1)/betaHat(2);
bHat = b * eigenvectors';
bHat = bHat + mu; % I know I have to add mu somewhere...
bHat = bHat/betaHat(2);
bHat = sum(sum(bHat)); % sum to reduce the matrix to a single value
% the correct value of bHat should be 6.3962
f = @(x) mHat.*x + bHat;
fplot(f,'k')
hold off
axis equal
title('Recovered feature space')
xlim([3 7])
ylim([0 4])
Any guidance on where my calculation of bHat goes wrong would be greatly appreciated.
In case anyone else runs into this: the solution is to use the bias term to find the y-intercept of the boundary in PCA space, yint = -SVMModel.Bias/betas(2). That y-intercept is just another point in PCA space, [0 yint], which can be inverse-transformed (recovered/unrotated) through the PCA back into the original space. The recovered point can then be plugged into the linear equation y = mx + b (i.e. b = y - mx) to solve for the intercept in the original space. So the code should be:
% and the decision boundary
betaHat = betas' * eigenvectors';
mHat = -betaHat(1)/betaHat(2);
yint = b/betas(2); % y-intercept in PCA space
yintHat = [0 yint] * eigenvectors'; % recover the point in original space
yintHat = yintHat + mu;
bHat = yintHat(2) - mHat*yintHat(1); % solve the linear equation
% the correct value of bHat is now 6.3962
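For completeness, here is a quick sanity check (a sketch, not part of the original post): sample a few points along the recovered line, project them back into PCA space, and confirm that the SVM assigns them decision scores of approximately zero, i.e. they lie on the boundary.
% sanity check (sketch): points on the recovered line should map back
% onto the SVM decision boundary in PCA space
xq = linspace(3, 7, 50)';                                  % x values spanning figure 2
boundaryPts = [xq, mHat.*xq + bHat];                       % points on the recovered line
scoresQ = bsxfun(@minus, boundaryPts, mu) * eigenvectors;  % project back into PCA space
[~, decisionVals] = predict(SVMModel, scoresQ);            % per-class SVM scores
disp(max(abs(decisionVals(:,1))))                          % should be (near) zero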