Using GradientTape for a tf.keras neural network with dictionary input (composed of multiple models)

Question

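I need to extract derivatives from a neural network implemented in TensorFlow/Keras 2.0 (super_model). Because of a problem I explained previously, the model is composed of several basic models (x1 to x6). (As a result, I get an error if I pass only the angles to the model.) See the following code: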

import numpy

angles = [0] * 21

data = {
    'x1_model_input': numpy.array([angles[0:3]]),
    'x2_model_input': numpy.array([angles[3:6]]),
    'x3_model_input': numpy.array([[angles[6]]]), 
    'x4_model_input': numpy.array([angles[7:13]]), 
    'x5_model_input': numpy.array([angles[13:15]]), 
    'x6_model_input': numpy.array([angles[15:21]])
}

# this super_model prediction is working well
pred = super_model.predict(data) # `pred` shape is `shape=(1,1)`

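Now I need to differentiate the network with respect to the input data using GradientTape. I tried the following, aiming to get the gradient of the network at the data specified above: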

with tf.GradientTape() as tape:
    pred = super_model(data)
# does not work as `data` is a dictionary
# the error is:
#         ...
#         return pywrap_tfe.TFE_Py_TapeGradient(
#     AttributeError: 'numpy.ndarray' object has no attribute '_id'
grad = tape.gradient(pred, data)

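However, data is a dictionary, so I cannot call tape.watch on it and then call gradient. I also cannot call tf.convert_to_tensor on data, because it is a dictionary. So my question is: how can I proceed without changing the structure of super_model?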
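
One obstacle named above, that tf.convert_to_tensor cannot take the whole dictionary at once, can probably be worked around by converting each value on its own. A minimal sketch, assuming the same key layout as data (only two keys are shown, and to_float_tensor and tensor_data are names made up for this illustration):

```python
import numpy as np
import tensorflow as tf

angles = [0] * 21

# Same layout as the dictionary in the question; two keys shown for brevity.
data = {
    'x1_model_input': np.array([angles[0:3]]),
    'x3_model_input': np.array([[angles[6]]]),
}

def to_float_tensor(value):
    # Cast to float32 first: gradients are only defined for floating-point tensors.
    return tf.convert_to_tensor(np.asarray(value, dtype=np.float32))

# tf.nest.map_structure applies the function to every value and keeps the keys.
tensor_data = tf.nest.map_structure(to_float_tensor, data)
```

The result is still a dictionary with the original keys, so it can be passed to the model in place of data.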

Answer 1

I'm not sure whether it fits your case, but your code works if you use tf.Variable instead of numpy:

import tensorflow as tf

angles=[0] * 21

test_tensor = tf.Variable([angles[0:3]], dtype=tf.float32)
data = {
    'x1_model_input': test_tensor,
    'x2_model_input': tf.Variable([angles[3:6]], dtype=tf.float32),
    'x3_model_input': tf.Variable([[angles[6]]], dtype=tf.float32), 
    'x4_model_input': tf.Variable([angles[7:13]], dtype=tf.float32), 
    'x5_model_input': tf.Variable([angles[13:15]], dtype=tf.float32), 
    'x6_model_input': tf.Variable([angles[15:21]], dtype=tf.float32)
}

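with tf.GradientTape() as tape:
    # Stand-in for super_model(data): only test_tensor contributes to pred
    pred = tf.constant([[1.0]]) * test_tensor

# tf.Variable objects are watched automatically, so tape.watch is not needed
grad = tape.gradient(pred, data)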
tf.print(grad)

{'x1_model_input': [[1 1 1]],
 'x2_model_input': None,
 'x3_model_input': None,
 'x4_model_input': None,
 'x5_model_input': None,
 'x6_model_input': None}
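
The same pattern should carry over to a real Keras model that takes a dictionary. Below is a minimal sketch, not the asker's actual super_model: toy_model, its two inputs, and the layer sizes are hypothetical stand-ins. It uses constant tensors rather than variables, so each one is registered with tape.watch explicitly.

```python
import tensorflow as tf

# Hypothetical stand-in for super_model: two named inputs instead of six.
in1 = tf.keras.Input(shape=(3,), name='x1_model_input')
in2 = tf.keras.Input(shape=(1,), name='x3_model_input')
merged = tf.keras.layers.Concatenate()([in1, in2])
hidden = tf.keras.layers.Dense(8, activation='tanh')(merged)
out = tf.keras.layers.Dense(1)(hidden)
toy_model = tf.keras.Model(
    inputs={'x1_model_input': in1, 'x3_model_input': in2},
    outputs=out,
)

angles = [0.0] * 4
data = {
    'x1_model_input': tf.constant([angles[0:3]], dtype=tf.float32),
    'x3_model_input': tf.constant([[angles[3]]], dtype=tf.float32),
}

with tf.GradientTape() as tape:
    # Constants are not tracked by default, so watch each dictionary value.
    for tensor in data.values():
        tape.watch(tensor)
    pred = toy_model(data)

# The sources may be a nested structure; the result mirrors the dictionary keys.
grad = tape.gradient(pred, data)
```

Because both inputs feed into the output here, every entry of grad is a tensor with the same shape as the corresponding input, rather than None as in the stand-in computation above.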
