Backpropagation across two parallel layers in Keras

I want to create a network with two parallel layers: the same input is fed to two different layers and their outputs are combined with some mathematical operation. That said, I am not sure whether backpropagation will be handled automatically by Keras. As a simple example, consider this custom RNN cell:

import tensorflow as tf
from tensorflow import keras


class Example(keras.layers.Layer):

    def __init__(self, units, **kwargs):
        super(Example, self).__init__(**kwargs)
        self.units = units
        self.state_size = units
        # two parallel Dense layers that receive the same input
        self.la = keras.layers.Dense(self.units)
        self.lb = keras.layers.Dense(self.units)

    def call(self, inputs, states):
        prev_output = states[0]
        # parallel layers
        a = tf.sigmoid(self.la(inputs))
        b = tf.sigmoid(self.lb(inputs))
        # combined using a mathematical operation
        output = (-1 * prev_output * a) + (prev_output * b)
        return output, [output]

Now, the loss gradients flowing into the `la` and `lb` layers are different (the gradient of `output` with respect to `a` is `-prev_output`, while with respect to `b` it is `prev_output`). Will this be taken care of by Keras automatically, or should we create custom gradient functions?
Any insights and suggestions are much appreciated :)
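
For reference, one way to check this empirically (a minimal sketch, assuming the `Example` cell defined above, random inputs, and `tf.reduce_sum` as a placeholder loss) is to call the cell once inside a `tf.GradientTape` and inspect the gradients of its trainable variables:

import tensorflow as tf

cell = Example(units=4)
x = tf.random.normal((2, 3))           # batch of 2 samples, 3 features
state = [tf.random.normal((2, 4))]     # previous output used as the state

with tf.GradientTape() as tape:
    output, _ = cell(x, state)
    loss = tf.reduce_sum(output)       # arbitrary placeholder loss

# Gradients flow into both parallel Dense layers without extra work
grads = tape.gradient(loss, cell.trainable_variables)
for var, g in zip(cell.trainable_variables, grads):
    print(var.name, g.shape)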

Answer

As long as every computation is chained through tensor objects (i.e. you do not convert tensors to other types such as NumPy arrays), Keras will handle backpropagation for you, so there is nothing to worry about.
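
For instance (a minimal sketch, assuming the `Example` cell from the question and dummy data with arbitrary shapes and hyperparameters), wrapping the cell in `keras.layers.RNN` and training with a standard `fit` call works with no custom gradient code:

import numpy as np
from tensorflow import keras

# Keras differentiates through both parallel Dense layers and the
# combining arithmetic automatically.
model = keras.Sequential([
    keras.layers.RNN(Example(units=8), input_shape=(5, 3)),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

x = np.random.rand(16, 5, 3).astype("float32")   # dummy sequences
y = np.random.rand(16, 1).astype("float32")      # dummy targets
model.fit(x, y, epochs=1, verbose=0)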

Using a gradient tape as an example, you can inspect the gradient of each layer like this:

# gradients of the loss w.r.t. every trainable variable in the model
gradients = grad_tape.gradient(total_loss, model.trainable_variables)
# largest gradient component of the last trainable variable
gradient_of_last_layer = tf.reduce_max(gradients[-1])
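
In context, those two lines would typically live inside a custom training step along the lines of the following sketch (reusing the `model`, `x`, and `y` from the sketch above; the loss and optimizer choices are arbitrary):

import tensorflow as tf

loss_fn = tf.keras.losses.MeanSquaredError()
optimizer = tf.keras.optimizers.Adam()

with tf.GradientTape() as grad_tape:
    predictions = model(x, training=True)
    total_loss = loss_fn(y, predictions)

# Gradients for every trainable variable, then one optimizer step
gradients = grad_tape.gradient(total_loss, model.trainable_variables)
optimizer.apply_gradients(zip(gradients, model.trainable_variables))

# Peek at the gradient of the last trainable variable
gradient_of_last_layer = tf.reduce_max(gradients[-1])
print(gradient_of_last_layer)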