C++/OpenCV: How to use BOWImgDescriptorExtractor to determine which clusters relate to which images in the vocabulary?

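My goal is to take an image as a query and find its best match in a library of images. I am using SURF features and the Bag of Words approach in OpenCV 3.0.0 to find matches. I need a way to determine whether the query image has a match in the library. If it does, I want to know the index of the image that is the closest match.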

Here is my code for reading in all of the images (300 in total in the library), then extracting and clustering the features:

// Start with an empty matrix; push_back appends each image's descriptor rows.
Mat training_descriptors;
// Read in all images and convert them to binary
char filepath[1000];
// Note: with i < trainingSetSize, the last file read is (trainingSetSize - 1).bmp
for (int i = 1; i < trainingSetSize; i++){
    cout << "in for loop, iteration: " << i << endl;
    _snprintf_s(filepath, 100, "C:/Users/Randal/Desktop/TestCase1Training/%d.bmp", i);
    Mat temp = imread(filepath, IMREAD_GRAYSCALE);
    if (temp.empty()) continue;   // skip files that could not be read
    Mat tempBW;
    adaptiveThreshold(temp, tempBW, 255, ADAPTIVE_THRESH_GAUSSIAN_C, THRESH_BINARY, 11, 2);
    detector->detect(tempBW, keypoints1);
    extractor->compute(tempBW, keypoints1, descriptors1);
    training_descriptors.push_back(descriptors1);
    cout << "descriptors added" << endl;
}
cout << "Total descriptors: " << training_descriptors.rows << endl;
trainer.add(training_descriptors);

Ptr<DescriptorMatcher> matcher = DescriptorMatcher::create("FlannBased");
BOWImgDescriptorExtractor BOW(extractor, matcher);
// The vocabulary: one row per cluster center (visual word)
Mat library = trainer.cluster();
BOW.setVocabulary(library);

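I wrote the following code in an attempt to find a match. The problem is that BOW.compute only returns the indices of the clusters (words) that exist in both the image and the image library. imgQ is the query image.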

Mat output;
Mat imgQBW;
adaptiveThreshold(imgQ, imgQBW, 255, ADAPTIVE_THRESH_GAUSSIAN_C, THRESH_BINARY, 11, 2);
imshow("query image", imgQBW);
detector->detect(imgQBW, keypoints2);
extractor->compute(imgQBW, keypoints2, descriptors2);

// Pass the query image's own keypoints (keypoints2), not keypoints1 from the training loop.
BOW.compute(imgQBW, keypoints2, output);
cout << output.row(0) << endl;

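I need to know which clusters in the BoW correspond to which images. My current output, output.row(0), is just an array containing all of the cluster indices found in the library. Am I misunderstanding this output? Is there a way to determine which image has the most matching clusters?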

I did something similar, based on this code:

https://github.com/royshil/FoodcamClassifier/blob/master/training_common.cpp

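The part above, however, covers what happens after the clustering is done. What you need to do is train your ML model (I used an SVM) with your cluster centers, that is, the bag of visual words you already have. To do so, find the cluster center "closest" to each keypoint descriptor and build a histogram from those assignments. This gives you a histogram of frequencies (the bag of keypoints) for each image, which is what you train on.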
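To make the "closest points" step concrete, here is a minimal sketch that does not use OpenCV. It assumes descriptors and cluster centers are plain float vectors, and `nearestWord` and `bowHistogram` are hypothetical helper names chosen for illustration, not OpenCV functions. Each descriptor votes for its nearest cluster center, and the counts are divided by the number of descriptors. BOWImgDescriptorExtractor::compute produces the same kind of result: a single row with one bin per vocabulary word, rather than a list of cluster indices.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Descriptor = std::vector<float>;

// Hypothetical helper: index of the vocabulary word (cluster center)
// closest to `d` in squared Euclidean distance.
std::size_t nearestWord(const Descriptor& d, const std::vector<Descriptor>& vocabulary) {
    assert(!vocabulary.empty());
    std::size_t best = 0;
    float bestDist = std::numeric_limits<float>::max();
    for (std::size_t w = 0; w < vocabulary.size(); ++w) {
        assert(vocabulary[w].size() == d.size());
        float dist = 0.0f;
        for (std::size_t k = 0; k < d.size(); ++k) {
            float diff = d[k] - vocabulary[w][k];
            dist += diff * diff;
        }
        if (dist < bestDist) { bestDist = dist; best = w; }
    }
    return best;
}

// Hypothetical helper: normalized histogram with one bin per vocabulary word.
// An image with no descriptors yields an all-zero histogram.
std::vector<float> bowHistogram(const std::vector<Descriptor>& descriptors,
                                const std::vector<Descriptor>& vocabulary) {
    std::vector<float> hist(vocabulary.size(), 0.0f);
    if (descriptors.empty()) return hist;
    for (const Descriptor& d : descriptors) hist[nearestWord(d, vocabulary)] += 1.0f;
    for (float& bin : hist) bin /= static_cast<float>(descriptors.size());
    return hist;
}
```

With a two-word vocabulary and four descriptors split evenly between the words, `bowHistogram` returns two bins of 0.5 each. The histogram says how often each word occurs in one image; it carries no information about which library image a word came from.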

ifstream ifs("training.txt");
int total_samples_in_file = 0;
vector<string> classes_names;
vector<string> lines;

// Read the file line by line; each line holds "<image path> <class label>".
for (string l; getline(ifs, l); ) lines.push_back(l);

for (size_t i = 0; i < lines.size(); i++) {

    vector<KeyPoint> keypoints;
    Mat response_hist;
    Mat img;
    string filepath;

    istringstream iss(lines[i]);

    iss >> filepath;

    string class_to_train;
    iss >> class_to_train;
    if (class_to_train.empty()) continue;   // skip lines without a class label
    string class_ml = "class_" + class_to_train;

    img = imread(filepath);
    if (img.empty()) continue;   // skip images that could not be read

    detector->detect(img, keypoints);
    // response_hist becomes a 1 x vocabulary-size normalized histogram of visual words
    bowide.compute(img, keypoints, response_hist);

    cout << "."; cout.flush();
    // Here, add the logic that stores response_hist as a training sample for class_ml (e.g. class_0).
}
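For the retrieval problem in the question (finding the library image closest to a query), one possible alternative to training a classifier is to keep one histogram per library image and compare the query histogram against each of them. This is a nearest-neighbor search, not what the code above does, and the sketch below only assumes that the histograms have already been computed, for example one response_hist per library image. `Match`, `findBestMatch`, and the `maxDistance` threshold are hypothetical names. The threshold would have to be tuned on real data.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

struct Match {
    int index;       // index of the closest library image, or -1 if none is close enough
    float distance;  // L2 distance between the query histogram and the closest one
};

// Hypothetical helper: linear scan over the per-image BoW histograms.
// libraryHists[i] is assumed to be the histogram of the i-th library image.
Match findBestMatch(const std::vector<float>& query,
                    const std::vector<std::vector<float>>& libraryHists,
                    float maxDistance) {
    Match best{-1, std::numeric_limits<float>::max()};
    for (std::size_t i = 0; i < libraryHists.size(); ++i) {
        assert(libraryHists[i].size() == query.size());
        float sum = 0.0f;
        for (std::size_t k = 0; k < query.size(); ++k) {
            float diff = query[k] - libraryHists[i][k];
            sum += diff * diff;
        }
        float dist = std::sqrt(sum);
        if (dist < best.distance) {
            best.index = static_cast<int>(i);
            best.distance = dist;
        }
    }
    // Treat a best distance above the threshold as "no match in the library".
    if (best.distance > maxDistance) best.index = -1;
    return best;
}
```

The returned index follows the order in which the library histograms were stored, so storing them in the same order as the image files makes it possible to map the result back to a file. Other distance measures, such as histogram intersection or chi-square, may work better than L2 depending on the data.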

You can find more information in this git project:
https://github.com/royshil/FoodcamClassifier
and in the documentation here:
http://www.morethantechnical.com/2011/08/25/a-simple-object-classifier-with-bag-of-words-using-opencv-2-3-w-code/