Keras, calculating gradients of the loss wrt the input on an LSTM

Question

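I am new to machine learning and have been studying adversarial examples. I am trying to fool a binary character-level LSTM text classifier, so I need the gradient of the loss w.r.t. the input.

However, the gradients function returns None.

I have already tried getting the gradients the way other posts suggest, but the gradients function still returns None.

EDIT: I want to do something similar to what is done in this git repo.

I was thinking the problem could be that it is an LSTM classifier. I am not sure at this point. But it should be possible to get these gradients even from an LSTM classifier, right?

Here is my code: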

import numpy as np
from keras.preprocessing import sequence
from keras.models import load_model
import data
import pickle
import keras.backend as K

def adversary():
    model, valid_chars = loadModel()    
    model.summary()

    #load data
    X, y, maxlen, _ , max_features, indata = prepare_data(valid_chars)

    target = y[0]

    # Get the loss and gradient of the loss wrt the inputs  
    target = np.asarray(target).astype('float32').reshape((-1,1))
    loss = K.binary_crossentropy(target, model.output)
    print(target)
    print(model.output)
    print(model.input)
    print(loss)
    grads = K.gradients(loss, model.input)

    #f = K.function([model.input], [loss, grads])

    #print(f(X[1:2]))
    print(model.predict(X[0:1]))

    print(grads)

The output looks like this:

Layer (type)                 Output Shape              Param #   
=================================================================
embedding_1 (Embedding)      (None, 74, 128)           5120      
_________________________________________________________________
lstm_1 (LSTM)                (None, 128)               131584    
_________________________________________________________________
dropout_1 (Dropout)          (None, 128)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 129       
_________________________________________________________________
activation_1 (Activation)    (None, 1)                 0         
=================================================================
Total params: 136,833
Trainable params: 136,833
Non-trainable params: 0
_________________________________________________________________
Maxlen: 74
Data preparing finished
[[0.]]
Tensor("activation_1/Sigmoid:0", shape=(?, 1), dtype=float32)
Tensor("embedding_1_input:0", shape=(?, 74), dtype=float32)
Tensor("logistic_loss_1:0", shape=(?, 1), dtype=float32)
[[1.1397913e-13]]
[None]

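I expected to get the gradients of the loss w.r.t. the input data, to see which characters have the most influence on the output, so that I could fool the classifier by modifying those characters. Is this possible? If so, what is wrong with my approach?

Thank you for your time.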

Answer 1

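Gradients can only be computed for "trainable" tensors, so you may want to wrap your input in tf.Variable().

As soon as you want to work with gradients, I'd suggest doing it with tensorflow, which integrates nicely with Keras. Below is an example of mine; note that it works in eager execution mode (the default in tensorflow 2.0).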

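import tensorflow as tf

def train_actor(self, sars):
    # Method of an agent class; self.actor, self.critic, self.compute_actions
    # and self.optimizer are assumed to be defined elsewhere.
    obs1, actions, rewards, obs2 = sars

    # Record the forward pass so the tape can differentiate through it.
    with tf.GradientTape() as tape:
        would_do_actions = self.compute_actions(obs1)
        score = tf.reduce_mean(self.critic(observations=obs1, actions=would_do_actions))
        loss = -score

    # Gradients of the loss w.r.t. the actor's weights, then one optimizer step.
    grads = tape.gradient(loss, self.actor.trainable_weights)
    self.optimizer.apply_gradients(zip(grads, self.actor.trainable_weights))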
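
The example above differentiates with respect to the weights. For the question's use case, the same tape can be pointed at the input instead. The following is a minimal sketch, assuming TensorFlow 2.x in eager mode; the stand-in model, the shapes, and the random data are made up for illustration and are not the asker's classifier. It uses `tape.watch` on a plain float tensor; a trainable `tf.Variable` is watched by the tape automatically, which is the effect the suggestion above relies on.

```python
import tensorflow as tf

# Hypothetical stand-in model with a float input (no embedding layer).
model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="tanh"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

x = tf.random.normal((1, 4))   # one made-up input sample with 4 features
y = tf.constant([[0.0]])       # made-up binary target

with tf.GradientTape() as tape:
    # Plain tensors are not tracked by default, so ask the tape to watch x.
    tape.watch(x)
    pred = model(x)
    loss = tf.reduce_mean(tf.keras.losses.binary_crossentropy(y, pred))

# Gradient of the loss w.r.t. the input; same shape as x.
input_grads = tape.gradient(loss, x)
print(input_grads.shape)
```

This only works because the input here is a float tensor. An integer input that feeds an embedding layer behaves differently.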

Answer 2

I just found this thread. The gradients function returns None because the embedding layer is not differentiable.

The embedding layer is implemented as K.gather which is not differentiable, so there is no gradient.
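
One possible workaround, sketched here under assumptions rather than taken from the thread, is to differentiate with respect to the output of the embedding layer, which is a float tensor, instead of the integer character indices. The sketch assumes TensorFlow 2.x in eager mode and rebuilds untrained layers whose sizes mirror the model summary in the question (vocabulary of 40, sequence length 74, width 128); the dropout layer is left out, and `char_ids` and `target` are random placeholders, not the asker's data.

```python
import numpy as np
import tensorflow as tf

# Sizes chosen to mirror the model summary in the question (assumption).
vocab_size, maxlen, embed_dim = 40, 74, 128

# Untrained stand-ins for the layers of the classifier.
embedding = tf.keras.layers.Embedding(vocab_size, embed_dim)
lstm = tf.keras.layers.LSTM(128)
dense = tf.keras.layers.Dense(1, activation="sigmoid")

# Random placeholder for one encoded string and its label.
char_ids = tf.constant(np.random.randint(0, vocab_size, size=(1, maxlen)))
target = tf.constant([[0.0]])

with tf.GradientTape() as tape:
    embedded = embedding(char_ids)   # float tensor, shape (1, maxlen, embed_dim)
    tape.watch(embedded)             # differentiate w.r.t. the embedded input
    pred = dense(lstm(embedded))
    loss = tf.reduce_mean(tf.keras.losses.binary_crossentropy(target, pred))

# Gradient of the loss w.r.t. each embedded character vector.
grads = tape.gradient(loss, embedded)

# One score per position: the L2 norm of that position's gradient vector.
saliency = tf.norm(grads, axis=-1)   # shape (1, maxlen)
most_influential = int(tf.argmax(saliency, axis=-1)[0])
print("position with the largest gradient norm:", most_influential)
```

The gradient norm per position gives a rough indication of which characters the loss is most sensitive to. Turning that into an actual character substitution is a separate step, since the gradient lives in embedding space and not in the discrete character space.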

Tags: gradient, keras, tensorflow