Creating Fisher vectors using OpenIMAJ

I am trying to classify images using Fisher vectors, as described in: Sánchez, J., Perronnin, F., Mensink, T., & Verbeek, J. (2013). Image Classification with the Fisher Vector: Theory and Practice. International Journal of Computer Vision, 105(3), 222–245. http://doi.org/10.1007/s11263-013-0636-x

To try out and evaluate this approach I want to use the OpenIMAJ library, since according to their JavaDoc they use exactly this method to create Fisher vectors, but I cannot get it to work. I have tried creating the SIFT feature vectors with both OpenIMAJ and OpenCV, and with both I get the same error: "EM algorithm was never able to comput a valid likelihood given initial parameters. Try different init parameters (or increasing n_init) or check for degenerate data."

If anyone has used this approach before, I would be grateful for any help. I have created a small example to demonstrate the problem:

    // load an image, flatten it to greyscale and extract DoG-SIFT keypoints
    LocalFeatureList<Keypoint> findFeatures = new DoGSIFTEngine()
            .findFeatures(ImageUtilities
                    .readMBF(
                            new URL(
                                    "http://upload.wikimedia.org/wikipedia/en/2/24/Lenna.png"))
                    .flatten());

    // convert to double array
    double[][] data = new double[findFeatures.size()][findFeatures.get(0)
            .getDimensions()];
    for (int i = 0; i < findFeatures.size(); i++) {
        data[i] = findFeatures.get(i).getFeatureVector().asDoubleVector();
    }

    GaussianMixtureModelEM gaussianMixtureModelEM = new GaussianMixtureModelEM(
            64, CovarianceType.Diagonal);

    // error is thrown here
    MixtureOfGaussians estimate = gaussianMixtureModelEM.estimate(data);

My guess is that the problem is that you are not using enough data. To train a GMM you should use many samples drawn from across your whole training corpus, not from a single image. Note also that you should apply PCA to reduce the dimensionality of the features before learning the GMM (this is not strictly required, but it helps performance considerably, as shown in the paper you linked).
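As a quick sanity check on sizes (plain Java, no OpenIMAJ needed): assuming the Fisher vector is built from the gradients with respect to the Gaussian means and variances only (no mixing-weight gradients), its length is 2·K·D for K Gaussians over D-dimensional descriptors, so reducing D with PCA directly controls the size of the final representation:

```java
public class FvDims {
    // Fisher vector length for K Gaussians over D-dimensional descriptors,
    // using gradients w.r.t. means and variances only: 2 * K * D.
    static int fisherVectorLength(int k, int d) {
        return 2 * k * d;
    }

    public static void main(String[] args) {
        // 64-component GMM over raw 128-d SIFT (as in the question's code):
        System.out.println(fisherVectorLength(64, 128)); // 16384
        // 128-component GMM over SIFT reduced to 64 dims by PCA:
        System.out.println(fisherVectorLength(128, 64)); // 16384
    }
}
```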

Once that is done, you can use the OpenIMAJ FisherVector class to compute the actual vectors from the SIFT points of each image.

As an aside: when you get to classification, you will almost certainly want to use a DenseSIFT variant rather than DoG-SIFT if you want any kind of decent performance.
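A minimal sketch of dense SIFT extraction with OpenIMAJ, based on the DenseSIFT/PyramidDenseSIFT classes from the OpenIMAJ tutorial; the step size, bin size and energy threshold here are illustrative values, not tuned for this task:

```java
import java.io.File;
import java.io.IOException;

import org.openimaj.feature.local.list.LocalFeatureList;
import org.openimaj.image.FImage;
import org.openimaj.image.ImageUtilities;
import org.openimaj.image.feature.dense.gradient.dsift.ByteDSIFTKeypoint;
import org.openimaj.image.feature.dense.gradient.dsift.DenseSIFT;
import org.openimaj.image.feature.dense.gradient.dsift.PyramidDenseSIFT;

public class DenseSiftExample {
    public static void main(String[] args) throws IOException {
        // load a greyscale image (path is illustrative)
        final FImage img = ImageUtilities.readF(new File("image.jpg"));

        // dense SIFT sampled every 5 pixels with 7-pixel spatial bins
        final DenseSIFT dsift = new DenseSIFT(5, 7);

        // pyramid wrapper extracting at a single window size of 7
        final PyramidDenseSIFT<FImage> pdsift = new PyramidDenseSIFT<FImage>(dsift, 6f, 7);

        pdsift.analyseImage(img);

        // keep only descriptors with some gradient energy
        final LocalFeatureList<ByteDSIFTKeypoint> feats = pdsift.getByteKeypoints(0.005f);
        System.out.println(feats.size() + " dense SIFT descriptors");
    }
}
```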

Here is some example code that builds FisherVectors from the first 100 images of the UKBench dataset:

    //Load features from disk
    final List<MemoryLocalFeatureList<FloatKeypoint>> data = new ArrayList<MemoryLocalFeatureList<FloatKeypoint>>();
    final List<FloatKeypoint> allKeys = new ArrayList<FloatKeypoint>();

    for (int i = 0; i < 100; i++) {
        final MemoryLocalFeatureList<FloatKeypoint> tmp = FloatKeypoint.convert(MemoryLocalFeatureList.read(
                new File(String.format("/Users/jsh2/Data/ukbench/sift/ukbench%05d.jpg", i)), Keypoint.class));
        data.add(tmp);
        allKeys.addAll(tmp);
    }

    //randomise their order
    Collections.shuffle(allKeys);

    //sample 1000 of them to learn the PCA basis with 64 dims
    final double[][] sample128 = new double[1000][];
    for (int i = 0; i < sample128.length; i++) {
        sample128[i] = ArrayUtils.convertToDouble(allKeys.get(i).vector);
    }

    System.out.println("Performing PCA " + sample128.length);
    final ThinSvdPrincipalComponentAnalysis pca = new ThinSvdPrincipalComponentAnalysis(64);
    pca.learnBasis(sample128);

    //project the 1000 training features by the basis (for computing the GMM)
    final double[][] sample64 = pca.project(new Matrix(sample128)).getArray();

    //project all the features by the basis, reducing their dimensionality
    System.out.println("Projecting features");
    for (final MemoryLocalFeatureList<FloatKeypoint> kpl : data) {
        for (final FloatKeypoint kp : kpl) {
            kp.vector = ArrayUtils.convertToFloat(pca.project(ArrayUtils.convertToDouble(kp.vector)));
        }
    }

    //Learn the GMM with 128 components
    System.out.println("Learning GMM " + sample64.length);
    final GaussianMixtureModelEM gmmem = new GaussianMixtureModelEM(128, CovarianceType.Diagonal);
    final MixtureOfGaussians gmm = gmmem.estimate(sample64);

    //build the fisher vector representations
    final FisherVector<float[]> fisher = new FisherVector<float[]>(gmm, true, true);

    int i = 0;
    final double[][] fvs = new double[100][];
    for (final MemoryLocalFeatureList<FloatKeypoint> kpl : data) {
        fvs[i++] = fisher.aggregate(kpl).asDoubleVector();
    }