Using ImageMagick to get a fuzzy hash of an image

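I have a situation with a lot of images, and I compare them using a specific fuzz factor (say 10%), looking for images that match. That works fine.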

However, sometimes I want to compare every image against every other image (for example, 1000 images). Doing 5000+ ImageMagick comparisons is far too slow.

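Hashing all the files and comparing the hash values 5000 times is lightning fast, but of course that only works when the images are identical (no fuzz factor).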

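I'm wondering whether there is some way to produce an ID or fingerprint, or maybe a range of IDs, that lets me quickly determine which images are close enough to each other, so that I pay the cost of an ImageMagick comparison only for those likely matches. Ideas or names of existing algorithms/approaches are very welcome.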

There are a lot of image hashing algorithms out there. pHash is the first one that comes to mind: http://www.phash.org/. That one copes with the basic transformations one might want to do on an image. If you want to be more sophisticated and roll your own, you can take a pre-trained image classifier such as an ImageNet model (https://www.learnopencv.com/keras-tutorial-using-pre-trained-imagenet-models/), lop off the final layer, and use the output of the penultimate layer as a feature vector. For a small number of images, you can easily do a brute-force nearest-neighbor search. If you have more, you can use annoy (https://github.com/spotify/annoy) to make the nearest-neighbor search more efficient.
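To make the fingerprint idea concrete, here is a minimal sketch. It does not use pHash's own algorithm; it uses a simpler relative, a difference hash (dHash), and it assumes each image has already been reduced to a grayscale grid of 8 rows by 9 columns (for instance by resizing it first). The function names, the toy pixel grids, and the threshold of 6 bits are all assumptions for illustration, not anything from pHash or ImageMagick.

```python
from itertools import combinations


def dhash(gray_rows):
    """Difference hash: one bit per horizontally adjacent pixel pair.

    gray_rows is a list of rows of grayscale values; an 8x9 grid
    (8 rows, 9 columns) yields a 64-bit integer.
    """
    bits = 0
    for row in gray_rows:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def candidate_pairs(hashes, max_distance=6):
    """Return name pairs whose hashes differ in at most max_distance bits.

    max_distance=6 is an assumed threshold; tune it on your own images.
    """
    return [
        (n1, n2)
        for (n1, h1), (n2, h2) in combinations(sorted(hashes.items()), 2)
        if hamming(h1, h2) <= max_distance
    ]


# Toy 8x9 grids standing in for downscaled images (hypothetical data).
gradient = [[col * 20 + row for col in range(9)] for row in range(8)]
# Same picture with slight brightness noise, as after recompression.
noisy = [[v + (3 if (r + c) % 4 == 0 else 0) for c, v in enumerate(row)]
         for r, row in enumerate(gradient)]
# A clearly different picture: the gradient mirrored left to right.
reverse = [list(reversed(row)) for row in gradient]

hashes = {
    "a.png": dhash(gradient),
    "a_recompressed.png": dhash(noisy),
    "b.png": dhash(reverse),
}
print(candidate_pairs(hashes))  # [('a.png', 'a_recompressed.png')]
```

Only the pairs that come out of `candidate_pairs` would then go through the expensive ImageMagick comparison with your fuzz factor. The all-pairs loop here is still quadratic (499,500 pairs for 1000 images), but each step is a single integer XOR, so it should be cheap at that scale. For much larger collections, that loop is where an approximate nearest-neighbor index like annoy would come in.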

hash

imagemagick