TypeError: An op outside of the function building code is being passed a Graph tensor

I am getting the following exception:

TypeError: An op outside of the function building code is being passed
a "Graph" tensor. It is possible to have Graph tensors
leak out of the function building context by including a
tf.init_scope in your function building code.
For example, the following function will fail:
  @tf.function
  def has_init_scope():
    my_constant = tf.constant(1.)
    with tf.init_scope():
      added = my_constant * 2
The graph tensor has name: conv2d_flipout/divergence_kernel:0

This also raises the following exception:

tensorflow.python.eager.core._SymbolicException: Inputs to eager execution function cannot be Keras symbolic tensors, but found [<tf.Tensor 'conv2d_flipout/divergence_kernel:0' shape=() dtype=float32>]

when I run the following code:

from __future__ import print_function

import tensorflow as tf
import tensorflow_probability as tfp


def get_bayesian_model(input_shape=None, num_classes=10):
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Input(shape=input_shape))
    model.add(tfp.layers.Convolution2DFlipout(6, kernel_size=5, padding="SAME", activation=tf.nn.relu))
    model.add(tf.keras.layers.Flatten())
    model.add(tfp.layers.DenseFlipout(84, activation=tf.nn.relu))
    model.add(tfp.layers.DenseFlipout(num_classes))
    return model

def get_mnist_data(normalize=True):
    img_rows, img_cols = 28, 28
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

    if tf.keras.backend.image_data_format() == 'channels_first':
        x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
        x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
        input_shape = (1, img_rows, img_cols)
    else:
        x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
        x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
        input_shape = (img_rows, img_cols, 1)

    x_train = x_train.astype('float32')
    x_test = x_test.astype('float32')

    if normalize:
        x_train /= 255
        x_test /= 255

    return x_train, y_train, x_test, y_test, input_shape


def train():
    # Hyper-parameters.
    batch_size = 128
    num_classes = 10
    epochs = 1

    # Get the training data.
    x_train, y_train, x_test, y_test, input_shape = get_mnist_data()

    # Get the model.
    model = get_bayesian_model(input_shape=input_shape, num_classes=num_classes)

    # Prepare the model for training.
    model.compile(optimizer=tf.keras.optimizers.Adam(), loss="sparse_categorical_crossentropy",
                  metrics=['accuracy'])

    # Train the model.
    model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, verbose=1)
    model.evaluate(x_test, y_test, verbose=0)


if __name__ == "__main__":
    train()

The problem is apparently related to the layer tfp.layers.Convolution2DFlipout. Why am I getting these exceptions? Is this due to a logical error in my code, or could it be a bug in TensorFlow or TensorFlow Probability? What do these errors mean, and how can I fix them?

I am using TensorFlow 2.0.0 (which executes eagerly by default), TensorFlow Probability 0.8.0 and Python 3.7.4. I have also opened the related issues here and here.

Please do not suggest that I use TensorFlow 1 to execute my code lazily (that is, calling tf.compat.v1.disable_eager_execution() after importing TensorFlow, given that I know this makes the code above run without the mentioned exceptions), or that I explicitly create sessions or placeholders.

This issue can be partially solved by setting the argument experimental_run_tf_function of the compile method to False, as I wrote in a comment to the Github issue I had opened.

However, if experimental_run_tf_function is set to False and you then try to use the predict method, you get another error. See this Github issue.
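
For reference, a minimal sketch of this partial workaround (the only change with respect to train() above is the extra keyword argument, which compile still accepted in TF 2.0.0; everything else is unchanged):

model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss="sparse_categorical_crossentropy",
              metrics=['accuracy'],
              # Partial workaround for TF 2.0.0 + TFP 0.8.0: fit runs,
              # but predict then fails with a different error (see above).
              experimental_run_tf_function=False)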


EDIT (28/09/2020)

experimental_run_tf_function has been removed from the latest versions of TF. Moreover, in the latest versions of TFP (the specific versions I used are listed below), the problem with the Bayesian convolutional layers (at least the ones that use the Flipout estimator) was fixed. See https://github.com/tensorflow/probability/issues/620#issuecomment-620821990 and https://github.com/tensorflow/probability/commit/1574c1d24c5dfa52bdf2387a260cd63a327b1839

Specifically, I used the following versions:

tensorflow==2.3.0
tensorflow-probability==0.11.0

I used both the dense and the convolutional Bayesian layers, and I did not use experimental_run_tf_function=False when calling compile.
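
In other words, with the versions above, the plain compile call from train() works as-is; here is a minimal sketch (nothing new compared to the code above, just the relevant calls without the workaround argument):

model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss="sparse_categorical_crossentropy",
              metrics=['accuracy'])
# No experimental_run_tf_function=False is needed with tensorflow==2.3.0
# and tensorflow-probability==0.11.0.
model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, verbose=1)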