What is the correct implementation of LDA (Linear Discriminant Analysis)?

I found that the results of LDA in OpenCV differ from those of other libraries. For example, the input data is

DATA (13 data samples with 4 dimensions)
 7    26     6    60
 1    29    15    52
11    56     8    20
11    31     8    47
 7    52     6    33
11    55     9    22
 3    71    17     6
 1    31    22    44
 2    54    18    22
21    47     4    26
 1    40    23    34
11    66     9    12
10    68     8    12

LABEL
 0     1     2     0     1     2     0     1     2     0     1     2     0

The OpenCV code is

// 13 samples (rows) with 4 features (columns)
Mat data = (Mat_<float>(13, 4) <<
        7, 26, 6, 60,
        1, 29, 15, 52,
        11, 56, 8, 20,
        11, 31, 8, 47,
        7, 52, 6, 33,
        11, 55, 9, 22,
        3, 71, 17, 6,
        1, 31, 22, 44,
        2, 54, 18, 22,
        21, 47, 4, 26,
        1, 40, 23, 34,
        11, 66, 9, 12,
        10, 68, 8, 12);

// Per-column mean (1 x 4), converted to double for subspaceProject
Mat mean;
reduce(data, mean, 0, CV_REDUCE_AVG);
mean.convertTo(mean, CV_64F);

// Labels cycle 0, 1, 2, ... to match LABEL above
Mat label(data.rows, 1, CV_32SC1);
for (int i = 0; i < label.rows; i++)
    label.at<int>(i) = i % 3;

// Fit LDA, then project the mean-centered data onto the eigenvectors
LDA lda(data, label);
Mat projection = lda.subspaceProject(lda.eigenvectors(), mean, data);

The MATLAB code is (using the Matlab Toolbox for Dimensionality Reduction)

cd drtoolbox\techniques\
load hald
label=[0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0]
[projection, trainedlda] = lda(ingredients, label)

The eigenvectors are

OpenCV (lda.eigenvectors())
0.4457    4.0132
0.4880    3.5703
0.5448    3.3466
0.5162    3.5794

Matlab Toolbox for Dimensionality Reduction (trainedlda.M)
0.5613    0.7159
0.6257    0.6203
0.6898    0.5884
0.6635    0.6262

The projections of the data are then

OpenCV
1.3261    7.1276
0.8892   -4.7569
-1.8092   -6.1947
-0.0720    1.1927
0.0768    3.3105
-0.7200    0.7405
-0.3788   -4.7388
1.5490   -2.8255
-0.3166   -8.8295
-0.8259    9.8953
1.3239   -3.1406
-0.5140    4.2194
-0.5285    4.0001

Matlab Toolbox for Dimensionality Reduction
1.8030    1.3171
1.2128   -0.8311
-2.3390   -1.0790
-0.0686    0.3192
0.1583    0.5392
-0.9479    0.1414
-0.5238   -0.9722
1.9852   -0.4809
-0.4173   -1.6266
-1.1358    1.9009
1.6719   -0.5711
-0.6996    0.7034
-0.6993    0.6397
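
As a sanity check, the first row of the OpenCV projection can be reproduced by hand from the numbers listed above. The sketch below assumes the projection is simply (x - mean) * W, with W being the 4 x 2 eigenvector matrix printed for OpenCV; that assumption is consistent with the values shown, but it is not taken from the library source. The helper names (haldData, opencvEigenvectors, columnMean, project) are made up for illustration and are not OpenCV functions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// One data sample with 4 features, and its 2-D projection.
using Sample = std::array<double, 4>;
using Projected = std::array<double, 2>;

// The 13 x 4 input data from the question (hypothetical helper).
std::vector<Sample> haldData() {
    return {
        {7, 26, 6, 60},  {1, 29, 15, 52}, {11, 56, 8, 20}, {11, 31, 8, 47},
        {7, 52, 6, 33},  {11, 55, 9, 22}, {3, 71, 17, 6},  {1, 31, 22, 44},
        {2, 54, 18, 22}, {21, 47, 4, 26}, {1, 40, 23, 34}, {11, 66, 9, 12},
        {10, 68, 8, 12}};
}

// The unnormalized eigenvectors printed for OpenCV, rounded as in the post.
std::array<Projected, 4> opencvEigenvectors() {
    return {{{0.4457, 4.0132},
             {0.4880, 3.5703},
             {0.5448, 3.3466},
             {0.5162, 3.5794}}};
}

// Column-wise mean of the samples; the data must not be empty.
Sample columnMean(const std::vector<Sample>& data) {
    assert(!data.empty());
    Sample mean{};  // zero-initialized accumulator
    for (const Sample& row : data)
        for (std::size_t j = 0; j < row.size(); ++j)
            mean[j] += row[j];
    for (double& m : mean)
        m /= static_cast<double>(data.size());
    return mean;
}

// Assumed projection: y = (x - mean) * W, with W stored as 4 rows x 2 columns.
Projected project(const Sample& x, const Sample& mean,
                  const std::array<Projected, 4>& W) {
    Projected y{};  // zero-initialized accumulator
    for (std::size_t j = 0; j < x.size(); ++j)
        for (std::size_t k = 0; k < y.size(); ++k)
            y[k] += (x[j] - mean[j]) * W[j][k];
    return y;
}
```

Calling project(haldData()[0], columnMean(haldData()), opencvEigenvectors()) should land close to the first OpenCV row (1.3261, 7.1276), up to the rounding of the printed eigenvectors.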

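Even though both LDA implementations are given the same data, the eigenvectors and the projections differ. I believe there are two possibilities:

  1. One of the libraries is wrong.
  2. I am doing something wrong.

Thanks!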

The difference is that the eigenvectors are not normalized. The normalized (L2 norm) eigenvectors are

OpenCV
0.44569   0.55196
0.48798   0.49105
0.54478   0.46028
0.51618   0.49230

Matlab Toolbox for Dimensionality Reduction
0.44064   0.55977
0.49120   0.48502
0.54152   0.46008
0.52087   0.48963
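
The normalization above can be sketched in plain C++ as follows: each column of the 4 x 2 eigenvector matrix is divided by its own L2 norm. The type alias Matrix4x2 and the function normalizeColumns are illustrative names, not part of OpenCV or the MATLAB toolbox.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Eigenvector matrix: 4 rows (features) x 2 columns (discriminant axes).
using Matrix4x2 = std::array<std::array<double, 2>, 4>;

// Return a copy of W with every column divided by its L2 norm.
Matrix4x2 normalizeColumns(Matrix4x2 W) {
    for (std::size_t k = 0; k < 2; ++k) {
        // Sum of squares of column k
        double sumSq = 0.0;
        for (std::size_t j = 0; j < 4; ++j)
            sumSq += W[j][k] * W[j][k];

        const double norm = std::sqrt(sumSq);
        assert(norm > 0.0);  // a zero column has no direction to normalize

        // Scale column k to unit length
        for (std::size_t j = 0; j < 4; ++j)
            W[j][k] /= norm;
    }
    return W;
}
```

Feeding in the unnormalized OpenCV and MATLAB matrices from the question should give the two tables listed above, up to rounding. Inside OpenCV itself, applying cv::normalize (whose default norm type is NORM_L2) to each column of lda.eigenvectors() should have the same effect, though that variant is not tested here.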

They look similar now, although they have quite different eigenvalues.

PCA in OpenCV returns normalized eigenvectors, but LDA does not. My next question is: "Is normalizing eigenvectors in LDA not necessary?"
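
For what it is worth, here is one way to look at that question, assuming both libraries maximize the usual Fisher criterion (an assumption; neither implementation is quoted here). With $S_B$ and $S_W$ denoting the between-class and within-class scatter matrices and $c \neq 0$ an arbitrary scale factor, the criterion does not change when an eigenvector $w$ is rescaled, and the projection is only stretched by the same factor:

```latex
J(w) = \frac{w^{\top} S_B\, w}{w^{\top} S_W\, w},
\qquad
J(c\,w) = \frac{c^{2}\, w^{\top} S_B\, w}{c^{2}\, w^{\top} S_W\, w} = J(w),
\qquad
(c\,w)^{\top}(x - \mu) = c\; w^{\top}(x - \mu).
```

Under that assumption the length of each eigenvector is a convention rather than part of the solution, so normalization mainly matters when projections from different libraries are compared number by number or when distances in the projected space are used.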