Deep learning with Python (cosine similarity)
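I am learning how to use the VGG16 model to identify similar objects. I created a folder "images" that holds some .jpg files.
But I am confused by the cosine_similarity part of the program.
As I understand it, the program converts every jpg in the "images" folder into a feature vector and compares them with each other; the closer the value is to 1, the more similar the two images are.
But I don't understand the following line: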
sim = ratings.dot(ratings.T)
Why is each jpg compared with itself (transposed) rather than with the others?
Could someone explain the cosine_similarity function below?
from keras.applications.vgg16 import VGG16, preprocess_input
from keras.preprocessing import image
import numpy as np
import os
import sys

# Folder that holds the .jpg files to compare (raw string keeps the backslashes literal)
IMAGE_DIR = r"C:\Users\Desktop\Python\ML\CNN model\VGG16\images"


# Compute the pairwise cosine similarity matrix (one row of `ratings` per image)
def cosine_similarity(ratings):
    # Entry (i, j) is the dot product of row i with row j
    sim = ratings.dot(ratings.T)
    # A sparse input gives a sparse product; convert it to a dense array
    if not isinstance(sim, np.ndarray):
        sim = sim.toarray()
    # The diagonal holds each row's squared length, so its square root is the norm
    norms = np.array([np.sqrt(np.diagonal(sim))])
    # Divide entry (i, j) by norm_i * norm_j
    return sim / norms / norms.T


def main():
    # Find all JPEG files in the image folder
    y_test = []
    x_test = []
    for img_path in os.listdir(IMAGE_DIR):
        if img_path.endswith(".jpg"):
            img = image.load_img(os.path.join(IMAGE_DIR, img_path), target_size=(224, 224))
            # Label each image with the first four characters of its file name
            y_test.append(img_path[0:4])
            x = image.img_to_array(img)
            x = np.expand_dims(x, axis=0)
            # Stack the images along axis 0 into one batch
            if len(x_test) > 0:
                x_test = np.concatenate((x_test, x))
            else:
                x_test = x

    # Convert to the VGG16 input format
    x_test = preprocess_input(x_test)

    # include_top=False drops VGG16's last 3 (fully connected) layers
    model = VGG16(weights="imagenet", include_top=False)

    # Get features
    features = model.predict(x_test)

    # Flatten each 7x7x512 feature map into one row, then compare all rows
    features_compress = features.reshape(len(y_test), 7 * 7 * 512)
    sim = cosine_similarity(features_compress)

    # Index of the query image, given on the command line
    inputNo = int(sys.argv[1])
    # Sort by descending similarity; position 0 is normally the query itself,
    # so skip it and keep the indices of the 2 most similar images
    top = np.argsort(-sim[inputNo], axis=0)[1:3]
    recommend = [y_test[i] for i in top]
    print(recommend)


if __name__ == "__main__":
    main()
Why is each jpg compared with itself (transposed) and not with the others?
sim = cosine_similarity(features_compress)
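So here, I believe features_compress is the set of features for all the images contained in x_test, one row per image, not the features of a single image. That is what you built with np.concatenate() in the preceding for loop.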
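To make the shapes concrete, here is a small sketch. Zero-filled arrays stand in for the real images and for the output of model.predict (both stand-ins are assumptions for illustration; no model is loaded):

```python
import numpy as np

# Stand-ins for three loaded images, each with a leading batch axis,
# as produced by np.expand_dims(x, axis=0) in the question.
images = [np.zeros((1, 224, 224, 3), dtype=np.float32) for _ in range(3)]

# np.concatenate stacks along axis 0, so the batch grows by one per image.
x_test = np.concatenate(images)
print(x_test.shape)             # (3, 224, 224, 3)

# VGG16 with include_top=False maps a 224x224 input to a 7x7x512 feature map.
# This zero array is a placeholder for model.predict(x_test).
features = np.zeros((len(images), 7, 7, 512), dtype=np.float32)

# Flattening gives one long feature vector (one row) per image.
features_compress = features.reshape(len(images), 7 * 7 * 512)
print(features_compress.shape)  # (3, 25088)
```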
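If that is indeed the case, then ratings.dot(ratings.T) is not comparing one image with itself: it takes the dot product of every row with every row, so entry (i, j) of the product compares image i with image j. Dividing by the norms turns those dot products into cosines, and you can picture the result returned by cosine_similarity() as a matrix telling you how similar each image is to every other image.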
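A tiny worked example may help. It is only a sketch: the three 2-dimensional "feature vectors" are made up for illustration, and the function is a dense-only variant of the one in the question.

```python
import numpy as np

def cosine_similarity(ratings):
    # Entry (i, j) is the dot product of row i with row j.
    sim = ratings.dot(ratings.T)
    # The diagonal holds each row's squared length.
    norms = np.sqrt(np.diagonal(sim))
    # Divide entry (i, j) by norm_i * norm_j.
    return sim / norms[:, None] / norms[None, :]

# Made-up feature vectors, one row per image.
features = np.array([
    [1.0, 0.0],   # image 0
    [2.0, 0.0],   # image 1: same direction as image 0, different length
    [0.0, 3.0],   # image 2: orthogonal to images 0 and 1
])

sim = cosine_similarity(features)
print(sim)
# [[1. 1. 0.]
#  [1. 1. 0.]
#  [0. 0. 1.]]

# The same value computed directly for one pair (image 0 vs image 1).
a, b = features[0], features[1]
direct = a.dot(b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(np.isclose(sim[0, 1], direct))  # True
```

Row i of the result is what the question's code sorts with np.argsort(-sim[inputNo]) to rank the other images against image i.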