Python/OpenCV: how to deal with a large number of SIFT features in .yml files?

Question

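I am using OpenCV and storing SIFT keypoints and descriptors in YAML. I have a database of 1659 images (.jpg, about 95 KB each). For each image I create a .yml file containing its keypoints and descriptors. For a single image I end up with about 700 keypoints and descriptors, which produces a file of ca. 4 MB. I would like to avoid using binary files.
My questions are: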

How can I know if the number of features is adequate for the image?

Is there any way to control the number of features? For example, by setting a threshold for SIFT?

When I store a numpy matrix in a YAML file using cv2.FileStorage.write, OpenCV writes each number with 16 digits after the decimal point (e.g. 1.9705572128295898e+00). Is there a problem if I reduce the number of significant digits, for example to 4?

Answer 1

How can I know if the number of features is adequate for the image?

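It depends on your images and on what your task requires. You should know that better than anyone else, or you can run experiments to find out.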

Is there any way to control the number of features?

Sure. Just pass the necessary parameters when you create the detector.

cv2.xfeatures2d.SIFT_create([, nfeatures[, nOctaveLayers[, contrastThreshold[, edgeThreshold[, sigma]]]]]) -> retval
 | . @param nfeatures The number of best features to retain. The features are ranked by their scores
 | . (measured in SIFT algorithm as the local contrast)
 | .
 | . @param nOctaveLayers The number of layers in each octave. 3 is the value used in D. Lowe paper. 
 | . The number of octaves is computed automatically from the image resolution.
 | .
 | . @param contrastThreshold The contrast threshold used to filter out weak features in semi-uniform
 | . (low-contrast) regions. The larger the threshold, the fewer features are produced by the detector.
 | .
 | . @param edgeThreshold The threshold used to filter out edge-like features. Note that its meaning
 | . is different from the contrastThreshold, i.e. the larger the edgeThreshold, the fewer features are
 | . filtered out (more features are retained).
 | .
 | . @param sigma The sigma of the Gaussian applied to the input image at the octave #0. If your image
 | . is captured with a weak camera with soft lenses, you might want to reduce the number.
 |
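To make the meaning of nfeatures concrete, here is a minimal sketch of the "rank by score, keep the best N" idea in plain Python. It is only an illustration of the ranking, not OpenCV's actual implementation. The `KeyPoint` tuple and the `retain_best` helper are hypothetical names made up for this example; a real cv2.KeyPoint also exposes a `response` attribute, so the same sort could be applied to detector output.

```python
from collections import namedtuple

# Hypothetical stand-in for cv2.KeyPoint, keeping only the fields used here.
KeyPoint = namedtuple("KeyPoint", ["x", "y", "response"])


def retain_best(keypoints, nfeatures):
    """Return the nfeatures keypoints with the highest response, best first."""
    ranked = sorted(keypoints, key=lambda kp: kp.response, reverse=True)
    return ranked[:nfeatures]


# Made-up keypoints with different response (local contrast) values.
keypoints = [
    KeyPoint(10.0, 20.0, 0.08),
    KeyPoint(30.0, 5.0, 0.02),
    KeyPoint(7.5, 42.0, 0.05),
    KeyPoint(50.0, 50.0, 0.01),
]

best = retain_best(keypoints, 2)
print([kp.response for kp in best])  # → [0.08, 0.05]
```

Asking for more keypoints than exist simply returns all of them, so nfeatures acts as an upper bound rather than a guaranteed count.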

For example, I create a SIFT detector that retains at most 50 keypoints and uses 3 layers per octave:

sift = cv2.xfeatures2d.SIFT_create(nfeatures=50, nOctaveLayers=3)

This is the detection result:

As for the files being too large: I understand that you are storing a large number of keypoints and descriptors in .yml format with OpenCV-Python.

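But when you need to store this much data, is .yml really useful? Is it really reasonable? Do you really need every element of each keypoint (Point2f, size, response, octave, class_id)? As for the descriptor, it is a histogram, in other words an array of integers. So even if you store it as int, the values are preserved exactly.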
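A rough way to see where the megabytes go is to count the characters needed to write the descriptor values as text. The sketch below uses synthetic data and rests on one assumption: that the descriptor values are whole numbers in the range 0 to 255 held as floats. Please verify that on your own descriptors before relying on it. The `text_size` helper is a made-up name for this example, and the `%.16e` format only imitates the style of number quoted in the question.

```python
import random


def text_size(values, fmt):
    """Number of characters needed to write the values as comma-separated text."""
    return len(", ".join(fmt % v for v in values))


# Synthetic stand-in for 700 descriptors of 128 elements each.
# Assumption: values are whole numbers in [0, 255] stored as floats.
random.seed(0)
descriptors = [float(random.randint(0, 255)) for _ in range(700 * 128)]

size_full = text_size(descriptors, "%.16e")  # e.g. 1.9700000000000000e+02
size_4digits = text_size(descriptors, "%.3e")  # 4 significant digits
size_int = text_size(descriptors, "%d")  # plain integers

# Check that the shorter formats read back to exactly the same numbers.
lossless_4digits = [float("%.3e" % v) for v in descriptors] == descriptors
lossless_int = [float("%d" % v) for v in descriptors] == descriptors

print(size_full, size_4digits, size_int)
print(lossless_4digits, lossless_int)
```

Under that assumption both shorter formats lose nothing, because a value of at most 255 never needs more than three digits. Keypoint fields such as the coordinates and the size are genuinely fractional, so rounding them to 4 significant digits does lose precision, and whether that matters depends on the task.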


python

opencv

yaml

numpy

sift