Matrix size error when trying to visualize maximum activation of CNN prediction layer in Keras
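Inspired by François Chollet's book "Deep Learning with Python" (first edition), I am trying to generate an image that maximizes a VGG16 model's prediction for a given class.
The original procedure, which targets intermediate layers, is described here (starting at cell 12):
Essentially, it performs gradient ascent on the input image; the code below builds the loss and its normalized gradient with respect to the input: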
import keras, matplotlib.pyplot as plt, numpy as np
from keras import backend as K, models
from keras.applications.vgg16 import decode_predictions, preprocess_input, VGG16
from keras.models import load_model
from keras.preprocessing import image
model = VGG16(weights='imagenet')
layer_name = 'block3_conv1'
filter_index = 0
layer_output = model.get_layer(layer_name).output
loss = K.mean(layer_output[:, :, :, filter_index])
grads = K.gradients(loss, model.input)[0]
grads /= (K.sqrt(K.mean(K.square(grads))) + 1e-5)
iterate = K.function([model.input], [loss, grads])
loss_value, grads_value = iterate([np.zeros((1, 150, 150, 3))])
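The snippet above only evaluates the loss and gradient once. As a minimal sketch of the update loop it feeds into, the example below applies the same RMS normalization in plain NumPy. The toy objective and the names `normalized_ascent`, `toy_loss`, `toy_grad` and `target` are hypothetical stand-ins for the network, not part of the original code.

```python
import numpy as np

def normalized_ascent(grad_fn, x0, step=1.0, n_steps=40):
    """Gradient ascent where each gradient is divided by its RMS value."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        g = grad_fn(x)
        # Same normalization as grads /= (K.sqrt(K.mean(K.square(grads))) + 1e-5)
        g = g / (np.sqrt(np.mean(np.square(g))) + 1e-5)
        # Ascent: move along the gradient so that the loss increases
        x = x + step * g
    return x

# Hypothetical toy objective standing in for the network's loss
target = np.array([3.0, -2.0, 5.0])

def toy_loss(x):
    return -np.sum((x - target) ** 2)

def toy_grad(x):
    return -2.0 * (x - target)

x_start = np.zeros(3)
x_final = normalized_ascent(toy_grad, x_start, step=0.1, n_steps=40)
print(toy_loss(x_start), toy_loss(x_final))
```

Because the gradient is rescaled at every step, the step size rather than the raw gradient magnitude controls how far the input moves per iteration.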
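To reproduce this for the final prediction, I reasoned that the last layer outputs a 1000-dimensional vector (one entry per class, 1000 in the case of VGG16), and that only one index needs to be maximized, say 285 for "cat".
Accordingly, I modified the code slightly: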
layer_pred_name = 'predictions'
pred_index = 285
layer_pred_output = model.get_layer(layer_pred_name).output
loss_pred = K.mean(layer_pred_output[:, pred_index])
grads_pred = K.gradients(loss_pred, model.input)[0]
grads_pred /= (K.sqrt(K.mean(K.square(grads_pred))) + 1e-5)
iterate_pred = K.function([model.input], [loss_pred, grads_pred])
loss_pred_value, grads_pred_value = iterate_pred([np.zeros((1, 150, 150, 3))])
Unfortunately, I get the following error:
InvalidArgumentError: Matrix size-incompatible: In[0]: [1,8192], In[1]: [25088,4096]
[[{{node fc1/MatMul}} = MatMul[T=DT_FLOAT, _class=["loc:@gradients_1/fc1/MatMul_grad/MatMul"], transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](flatten/Reshape, fc1/kernel/read)]]
The dimensions look fine to me, so I don't understand the error.
Any ideas on how to solve this would be greatly appreciated.
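The two sizes in the error message can be reproduced by hand, which may help in reading it. The sketch below assumes the standard VGG16 layout: five 2x2 max-pooling stages and 512 channels in the last convolutional block. The helper name `vgg16_flatten_size` is made up for illustration.

```python
def vgg16_flatten_size(side, n_pools=5, channels=512):
    """Length of VGG16's flattened feature vector for a square input."""
    for _ in range(n_pools):
        side //= 2  # each 2x2 max-pooling layer halves the side, rounding down
    return side * side * channels

print(vgg16_flatten_size(150))  # 8192, the In[0] width in the error
print(vgg16_flatten_size(224))  # 25088, the number of rows in fc1's kernel
```

Under that reading, the mismatch would come from the 150x150 array passed to `iterate_pred`, while `fc1` expects the features of a 224x224 input. Feeding `np.zeros((1, 224, 224, 3))` would be the first thing to try; this is an untested assumption, not a confirmed fix.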
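In the end, I found a workaround by writing my own random search function, which minimizes the difference between the model's prediction and a given target prediction: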
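def prediction_leastquares(input1, input2):
    # Euclidean distance between two prediction vectors
    leastsquare = 0
    for idx in range(len(input1)):
        leastsquare = leastsquare + (input1[idx] - input2[idx])**2
    leastsquare = leastsquare**(1/2)
    return leastsquare

# Target prediction: all probability mass on class 285 ("cat")
opt_pred = np.zeros(1000)
opt_pred[285] = 1

# x is assumed to be an input batch defined earlier; it is not shown in this post
x2 = np.zeros(x.shape) + 100
x2 = np.array(x2)
# Initial best distance; two probability vectors are at most sqrt(2) apart
predsdiff2 = 2
for i in range(10000):
    preds2 = model.predict(x2)
    # Perturb the current best image with Gaussian noise
    x1 = x2.copy()
    x1 = x1 + np.random.normal(loc=0.0, scale=1, size=[1, x1.shape[1], x1.shape[2], 3])
    preds1 = model.predict(x1)
    predsdiff1 = prediction_leastquares(preds1[0], opt_pred)
    # Keep the candidate only if it is closer to the target prediction
    if (predsdiff1 < predsdiff2):
        predsdiff2 = predsdiff1
        x2 = x1.copy()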
The final output is a random-looking image that is classified as "cat" with very high confidence: a hands-on adversarial attack.
Optimized picture classified as cat by VGG16
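As a side note on the distance helper used in the random search, the same Euclidean distance can be computed without a Python-level loop. This is a minimal sketch; the name `prediction_distance` and the `uniform` test vector are illustrative and not part of the original code.

```python
import numpy as np

def prediction_distance(pred, target):
    """Euclidean distance between two prediction vectors."""
    return float(np.linalg.norm(np.asarray(pred) - np.asarray(target)))

# Target prediction with all probability mass on class 285
target = np.zeros(1000)
target[285] = 1.0

# A maximally uncertain prediction, used here only as an example input
uniform = np.full(1000, 1.0 / 1000)
print(prediction_distance(uniform, target))
```

It should be usable as a drop-in replacement for `prediction_leastquares(preds1[0], opt_pred)`, since both return the L2 norm of the difference.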