plsregress - can anyone explain the normalisation of features?

Question

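I am using plsregress in MATLAB as a feature reduction method before performing LDA. I am trying to cross-validate my method, but I am having some trouble replicating its "data processing" stage. The documentation says: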

plsregress uses the SIMPLS algorithm, first centering X and Y by subtracting off column means to get centered variables X0 and Y0. However, it does not rescale the columns. To perform PLS with standardized variables, use zscore to normalize X and Y.

To try to replicate this on my "test" set, I did the following:

test = test - repmat(mean(test), DIM(1), 1);

test = Xloadings\test';
test = test';

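For some reason this does not quite work: when I apply it to the training set, I do not get the same Xscores.

Can anyone explain whether I have missed a step, or what I am doing wrong?

Edit: in other words, how do I apply the model produced by PLS to another data set?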

Answer 1

I think you need to use the mean of the training set, not that of the test set. The same applies to sigma if you standardize.
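A minimal sketch of the centering step, under stated assumptions: Xtrain, Ytrain, Xtest and ncomp are hypothetical names and toy data, not taken from the question. It also goes one step beyond the advice above by projecting with the W field of the eighth plsregress output (stats), which the documentation describes as the weights satisfying XS = X0*W. Solving against Xloadings, as in the question, is not generally the same operation, which may be why the training Xscores were not reproduced.

```matlab
% Hypothetical toy data; replace with your own training and test sets.
rng(0);
Xtrain = randn(50, 8);
Ytrain = Xtrain(:, 1:2) * [1; -2] + 0.1 * randn(50, 1);
Xtest  = randn(20, 8);
ncomp  = 3;   % assumed number of PLS components

% Fit on the training set only; stats.W maps centered X to X scores.
[XL, YL, XS, YS, beta, pctvar, mse, stats] = plsregress(Xtrain, Ytrain, ncomp);

% Center both sets with the TRAINING mean (bsxfun works on old releases too).
muTrain      = mean(Xtrain, 1);
XStrainCheck = bsxfun(@minus, Xtrain, muTrain) * stats.W;  % recomputed training scores
XStest       = bsxfun(@minus, Xtest,  muTrain) * stats.W;  % test scores, same space

% Largest gap between recomputed and returned training scores
% (expected to be near zero if XS = X0*W holds as documented).
disp(max(abs(XStrainCheck(:) - XS(:))));
```

Checking the recomputed training scores against XS first is a cheap way to confirm the projection before trusting XStest as input to LDA.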

Use [Z,mu,sigma] = zscore(train), then apply that mu and sigma to the test set.
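A short sketch of how that might look, assuming train and test are numeric matrices with observations in rows. The toy data and the zero-sigma guard are illustrative additions, not part of the original answer.

```matlab
% Hypothetical toy data with very different column scales.
rng(1);
scales = [1 10 100 0.1 5];
train  = bsxfun(@times, randn(40, 5), scales);
test   = bsxfun(@times, randn(15, 5), scales);

% Estimate mu and sigma on the training set only.
[Ztrain, mu, sigma] = zscore(train);

% Guard against constant columns so the division below stays finite.
sigma(sigma == 0) = 1;

% Standardize the test set with the TRAINING statistics.
Ztest = bsxfun(@rdivide, bsxfun(@minus, test, mu), sigma);
```

Ztrain and the training responses would then go to plsregress, and Ztest would be projected with the fitted model; the test columns will not have exactly zero mean or unit variance, which is expected.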

matlab

machine-learning

cross-validation