Get gradient value necessary to break an image

I have been experimenting with adversarial images, and I read about the fast gradient sign method from the following link https://arxiv.org/pdf/1412.6572.pdf...

The paper explains that the necessary gradient can be calculated using backpropagation...
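
For reference, the fast gradient sign method in that paper builds the perturbation directly from this gradient: the adversarial image is the original image plus epsilon times the sign of the gradient of the loss with respect to the image. A minimal TensorFlow 1.x sketch of that single step, assuming a loss tensor `loss`, an input tensor `x_image`, and a step size `epsilon` (these names are only illustrative, not taken from my code below):

# gradient of the loss with respect to the input image, not the weights
grad = tf.gradients(loss, x_image)[0]
# perturb the image by epsilon in the direction of the sign of that gradient
x_adv = x_image + epsilon * tf.sign(grad)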

I have been successful in generating adversarial images, but I have failed in attempting to extract the gradients needed to create the adversarial image. I will demonstrate what I mean.

Let us assume that I have trained my algorithm using logistic regression. I restore the model and extract the number that I wish to change into an adversarial image. In this case it is the number 2...

# construct model
logits = tf.matmul(x, W) + b
pred = tf.nn.softmax(logits)
...
...
# assign the images of number 2 to the variable
sess.run(tf.assign(x, labels_of_2))
# setup softmax
sess.run(pred)

# placeholder for target label
fake_label = tf.placeholder(tf.int32, shape=[1])
# setup the fake loss
fake_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits,labels=fake_label)

# minimize fake loss using gradient descent,
# calculating the derivatives of the weight of the fake image will give the direction of weights necessary to change the prediction
adversarial_step = tf.train.GradientDescentOptimizer(learning_rate=FLAGS.learning_rate).minimize(fake_loss, var_list=[x])

# continue calculating the derivative until the prediction changes for all 10 images
for i in range(FLAGS.training_epochs):
    # fake label tells the training algorithm to use the weights calculated for number 6
    sess.run(adversarial_step, feed_dict={fake_label:np.array([6])})
    sess.run(pred)

This is my approach, and it works well. It takes my image of the number 2 and alters it only slightly, so that when I run the following...

x_in = np.expand_dims(x[0], axis=0)
classification = sess.run(tf.argmax(pred, 1))
print(classification)

it will predict the number 2 as a number 6.

The issue is that I need to extract the gradient necessary to fool the neural network into thinking the number 2 is a 6. I need to use this gradient to create the nematode perturbation mentioned above.

I am not sure how to extract the gradient values. I tried looking at tf.gradients, but I was unable to figure out how to produce an adversarial image using this function. I implemented the following after the fake_loss variable above...

gradients = tf.gradients(fake_loss, x)

for i in range(FLAGS.training_epochs):
    # calculate gradient with weight of number 6
    gradient_value = sess.run(gradients, feed_dict={fake_label:np.array([6])})
    # update the image of number 2
    gradient_update = x+0.007*gradient_value[0]
    sess.run(tf.assign(x, gradient_update))
    sess.run(pred)

Unfortunately, the prediction does not change the way I want it to, and this logic also results in a rather blurry image.

I would appreciate an explanation of what I need to do in order to calculate and extract the gradient that will fool the neural network, so that if I were to take that gradient and apply it to my image as the nematode, it would result in a different prediction.

Why not let the TensorFlow optimizer add the gradients to your image? You can still evaluate the nematode afterwards to obtain the resulting gradients that were added.
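
As a rough sketch of that idea applied to your MNIST setup (this reuses `W`, `b` and `fake_label` from your code, while `image_of_2` and `perturbation` are illustrative names; I have not run it against your model): keep the original image frozen, make only a separate perturbation variable trainable, and let the optimizer push that variable; the "nematode" is then a single sess.run away.

# the original image stays constant; only the perturbation is trainable
x_orig = tf.constant(image_of_2, dtype=tf.float32)         # your digit-2 image, e.g. shape [1, 784]
perturbation = tf.Variable(tf.zeros_like(x_orig))          # this will become the "nematode"
x_adv = tf.clip_by_value(x_orig + perturbation, 0.0, 1.0)  # keep pixel values valid

logits = tf.matmul(x_adv, W) + b                           # same restored weights as before
fake_label = tf.placeholder(tf.int32, shape=[1])
fake_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=fake_label)

# only the perturbation is in var_list, so W, b and the image itself never change
adversarial_step = tf.train.GradientDescentOptimizer(0.01).minimize(
    fake_loss, var_list=[perturbation])

sess.run(tf.variables_initializer([perturbation]))
for i in range(1000):
    sess.run(adversarial_step, feed_dict={fake_label: np.array([6])})

# the extracted gradient-based perturbation you were asking for
nematode = sess.run(perturbation)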

I created some example code to demonstrate this with an image of a panda. It uses the VGG16 neural network to change your own image of a panda into a "goldfish" image. Every 100 iterations it saves the image as a PDF, so you can print it losslessly to check whether your image is still a goldfish.

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import IPython.display as ipyd
from libs import vgg16 # Download here! https://github.com/pkmital/CADL/tree/master/session-4/libs

pandaimage = plt.imread('panda.jpg')
pandaimage = vgg16.preprocess(pandaimage)
plt.imshow(pandaimage)

img_4d = np.array([pandaimage])

g = tf.get_default_graph()
# the original panda image is frozen; only the perturbation below is trainable
input_placeholder = tf.Variable(img_4d, trainable=False)
to_add_image = tf.Variable(tf.random_normal([224, 224, 3], mean=0.0, stddev=0.1, dtype=tf.float32))
combined_images_not_clamped = input_placeholder + to_add_image

# clamp the combined image to the valid [0, 1] range
filledmax = tf.fill(tf.shape(combined_images_not_clamped), 1.0)
filledmin = tf.fill(tf.shape(combined_images_not_clamped), 0.0)
greater_than_one = tf.greater(combined_images_not_clamped, filledmax)

combined_images_with_max = tf.where(greater_than_one, filledmax, combined_images_not_clamped)
lower_than_zero = tf.less(combined_images_with_max, filledmin)
combined_images = tf.where(lower_than_zero, filledmin, combined_images_with_max)

net = vgg16.get_vgg_model()
tf.import_graph_def(net['graph_def'], name='vgg')
names = [op.name for op in g.get_operations()]

# feed the clamped adversarial image into VGG16 and grab the softmax output
style_layer = 'prob:0'
the_prediction = tf.import_graph_def(
    net['graph_def'],
    name='vgg',
    input_map={'images:0': combined_images}, return_elements=[style_layer])

# ImageNet class 1 is "goldfish", so the target distribution puts all of the mass there
goldfish_expected_np = np.zeros(1000)
goldfish_expected_np[1] = 1.0
goldfish_expected_tf = tf.Variable(goldfish_expected_np, dtype=tf.float32, trainable=False)
loss = tf.reduce_sum(tf.square(the_prediction[0] - goldfish_expected_tf))
optimizer = tf.train.AdamOptimizer().minimize(loss)


sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())


def show_many_images(*images):
    fig = plt.figure()
    for i in range(len(images)):
        print(images[i].shape)
        subplot_number = 100+10*len(images)+(i+1)
        plt.subplot(subplot_number)
        plt.imshow(images[i])
    plt.show()



for i in range(1000):
    _, loss_val = sess.run([optimizer, loss])

    if i % 100 == 1:
        print("Loss at iteration %d: %f" % (i, loss_val))
        # evaluate the adversarial image, the prediction and the added perturbation ("nematode")
        _, loss_val, adversarial_image, pred, nematode = sess.run(
            [optimizer, loss, combined_images, the_prediction, to_add_image])
        res = np.squeeze(pred)
        average = np.mean(res, 0)
        res = res / np.sum(average)
        plt.imshow(adversarial_image[0])
        plt.show()
        # print the top-5 predicted classes with their labels
        print([(res[idx], net['labels'][idx]) for idx in res.argsort()[-5:][::-1]])
        show_many_images(img_4d[0], nematode, adversarial_image[0])
        plt.imsave('adversarial_goldfish.pdf', adversarial_image[0], format='pdf')  # save for printing

Let me know if this helps!