How to check similarity of two images that have different pixelization

I am running Python code to check the similarity of Quora and Twitter users' profile photos, but I don't get a positive result when the images are the same.

This is the code for comparing the two images:

path_photo_quora= "/home/yousuf/Desktop/quora_photo.jpg"
path_photo_twitter="/home/yousuf/Desktop/twitter_photo.jpeg"
if open(path_photo_quora,"rb").read() == open(path_photo_twitter,"rb").read():
     print('photos profile are identical')

The console does not print "photos profile are identical" even though the images are the same. What should I do?

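The two images are not the same; only the thing they depict is. As you note yourself, the images are obviously different sizes, so their files contain different bytes and a byte-for-byte comparison must fail.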

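You need some kind of similarity check. The first step is to scale the smaller image up to the size of the larger one. Then you need some method of detecting and defining similarity. There are different ways and methods for that, and any combination of them might be valid.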

For an example, see Checking images for similarity with OpenCV.
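
As a rough illustration of the two steps described above (bring both images to a common size, then measure how far apart they are), here is a minimal sketch in pure Python. It assumes each image has already been decoded into a 2D list of grayscale values (0 to 255). The helper names upscale_nearest and mean_abs_diff are made up for this example; in practice an image library would handle decoding and resizing.

```python
def upscale_nearest(pixels, new_w, new_h):
    """Resize a 2D list of grayscale values with nearest-neighbour sampling."""
    old_h, old_w = len(pixels), len(pixels[0])
    return [
        [pixels[y * old_h // new_h][x * old_w // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]


def mean_abs_diff(a, b):
    """Average absolute per-pixel difference of two equally sized images."""
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))


small = [[0, 255], [255, 0]]                      # hypothetical 2x2 image
large = [[0, 0, 255, 255], [0, 0, 255, 255],
         [255, 255, 0, 0], [255, 255, 0, 0]]      # same picture at 4x4

scaled = upscale_nearest(small, 4, 4)
score = mean_abs_diff(scaled, large)              # 0.0 means pixel-identical
```

With real photos the score will rarely be exactly zero, so you would compare it against a threshold chosen for your data.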

You can use the imagehash library to compare similar images.

from PIL import Image
import imagehash
hash0 = imagehash.average_hash(Image.open('quora_photo.jpg')) 
hash1 = imagehash.average_hash(Image.open('twitter_photo.jpeg')) 
cutoff = 5  # maximum bits that could be different between the hashes. 

if hash0 - hash1 < cutoff:
  print('images are similar')
else:
  print('images are not similar')

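Since the images are not exactly the same, there will be some differences, so we use a cutoff value for the maximum acceptable difference. The difference between two hash objects is the number of bits that are flipped. imagehash still works when the images are resized, compressed, saved in different file formats, or have adjusted contrast or colors.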

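The hash (or fingerprint, really) is derived from an 8x8 monochrome thumbnail of the image. Even with such a reduced sample, the similarity comparisons give quite accurate results. Adjust the cutoff to find an acceptable balance between false positives and false negatives.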
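
To make the fingerprint idea concrete, here is a simplified sketch of an average hash in pure Python. It is not the library's actual code: it assumes the 8x8 grayscale thumbnail is already available as a 2D list, and average_hash_bits is a hypothetical helper name. Each pixel contributes one bit, set when the pixel is brighter than the thumbnail's mean.

```python
def average_hash_bits(thumb):
    """Return one bit per pixel: 1 where it is brighter than the mean."""
    flat = [p for row in thumb for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]


# Hypothetical 8x8 thumbnail: dark left half, bright right half.
thumb = [[20] * 4 + [200] * 4 for _ in range(8)]
# The same picture with lower contrast, as after editing or recompression.
faded = [[60] * 4 + [160] * 4 for _ in range(8)]

# Both versions land on the same 64-bit pattern, because each pixel is
# compared against its own image's mean rather than an absolute level.
same_fingerprint = average_hash_bits(thumb) == average_hash_bits(faded)
```

This is why a contrast change or a resize tends to leave the hash nearly unchanged, while a byte-for-byte file comparison fails.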

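With 64-bit hashes, a difference of 0 means the hashes are identical. A difference of 32 means there is no similarity at all. A difference of 64 means that one hash is the exact negative of the other.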
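
The three reference points above can be checked with a small sketch that treats each hash as a plain 64-bit integer. The function name hamming64 and the example values are assumptions made for illustration only; they are not part of imagehash.

```python
import random

MASK64 = (1 << 64) - 1


def hamming64(a, b):
    """Number of differing bits between two 64-bit integers."""
    return bin((a ^ b) & MASK64).count("1")


h = 0x0F0F0F0F0F0F0F0F                 # arbitrary example hash
same = hamming64(h, h)                 # identical hashes
inverse = hamming64(h, ~h & MASK64)    # every bit flipped

# Two unrelated random hashes differ in about half of their bits on average.
rng = random.Random(0)
pairs = [(rng.getrandbits(64), rng.getrandbits(64)) for _ in range(2000)]
average = sum(hamming64(a, b) for a, b in pairs) / len(pairs)
```

Here same is 0 and inverse is 64, while average comes out close to 32, which is why a distance near 32 carries no evidence of similarity.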

import cv2

class CompareImage(object):

    def __init__(self, image_1_path, image_2_path):
        self.minimum_commutative_image_diff = 1
        self.image_1_path = image_1_path
        self.image_2_path = image_2_path

    def compare_image(self):
        # Load both images as grayscale (flag 0).
        image_1 = cv2.imread(self.image_1_path, 0)
        image_2 = cv2.imread(self.image_2_path, 0)
        commutative_image_diff = self.get_image_difference(image_1, image_2)

        if commutative_image_diff < self.minimum_commutative_image_diff:
            print("Matched")
            return commutative_image_diff
        return 10000  # arbitrary failure value

    @staticmethod
    def get_image_difference(image_1, image_2):
        first_image_hist = cv2.calcHist([image_1], [0], None, [256], [0, 256])
        second_image_hist = cv2.calcHist([image_2], [0], None, [256], [0, 256])

        img_hist_diff = cv2.compareHist(first_image_hist, second_image_hist, cv2.HISTCMP_BHATTACHARYYA)
        img_template_probability_match = cv2.matchTemplate(first_image_hist, second_image_hist, cv2.TM_CCOEFF_NORMED)[0][0]
        img_template_diff = 1 - img_template_probability_match

        # taking only 10% of histogram diff, since it's less accurate than template method
        commutative_image_diff = (img_hist_diff / 10) + img_template_diff
        return commutative_image_diff


if __name__ == '__main__':
    compare_image = CompareImage('image1/path', 'image2/path')
    image_difference = compare_image.compare_image()
    print(image_difference)