Why am I getting "ValueError: No gradients provided for any variable: ['Variable:0']." error?

I am very new to TensorFlow and I am trying to build a style transfer model. I understand the concept of the model, but I am having difficulty actually implementing it, because I don't fully understand what is going on in TensorFlow. When I try to run the optimization for the generated image, I get the "No gradients provided" error, which I don't understand, since my code has:

    loss = total_loss(content_feats, style_feats, output_feats)

    grad = tape.gradient(loss, output_processado)
    optimizer.apply_gradients(zip([grad],[output_processado]))

ValueError                                Traceback (most recent call last)

in ()
      8 
      9 grad = tape.gradient(loss, output_processado)
---> 10 optimizer.apply_gradients(zip([grad],[output_processado]))
     11 
     12 clip = tf.clip_by_value(output_processado, min_value, max_value)

1 frames

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py in _filter_grads(grads_and_vars)
   1217   if not filtered:
   1218     raise ValueError("No gradients provided for any variable: %s." %
-> 1219                      ([v.name for _, v in grads_and_vars],))
   1220   if vars_with_empty_grads:
   1221     logging.warning(

ValueError: No gradients provided for any variable: ['Variable:0'].

import tensorflow as tf
device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
  raise SystemError('GPU device not found')
print('Found GPU at: {}'.format(device_name))


import numpy as np
from PIL import Image
import requests
from io import BytesIO
from keras.applications.vgg19 import VGG19
from keras.applications.vgg19 import preprocess_input
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from keras.models import Model
import keras.backend as K
from matplotlib import pyplot as plt
from numpy import expand_dims
from tensorflow import GradientTape

ITERATIONS = 10
CHANNELS = 3
IMAGE_SIZE = 500
IMAGE_WIDTH = IMAGE_SIZE
IMAGE_HEIGHT = IMAGE_SIZE
CONTENT_WEIGHT = 0.02
STYLE_WEIGHT = 4.5

MEAN = np.array([103.939, 116.779, 123.68])

CONTENT_LAYERS = ['block4_conv2']
STYLE_LAYERS = ['block1_conv1', 'block2_conv1', 'block3_conv1', 'block4_conv1', 'block5_conv1']

input_image_path = "input.png"
style_image_path = "style.png"
output_image_path = "output.png"
combined_image_path = "combined.png"

san_francisco_image_path = "https://www.economist.com/sites/default/files/images/print-edition/20180602_USP001_0.jpg"

tytus_image_path = "http://meetingbenches.com/wp-content/flagallery/tytus-brzozowski-polish-architect-and-watercolorist-a-fairy-tale-in-warsaw/tytus_brzozowski_13.jpg"


input_image = Image.open(BytesIO(requests.get(san_francisco_image_path).content))
input_image = input_image.resize((IMAGE_WIDTH, IMAGE_HEIGHT))
input_image.save(input_image_path)
#input_image

# Style visualization 
style_image = Image.open(BytesIO(requests.get(tytus_image_path).content))
style_image = style_image.resize((IMAGE_WIDTH, IMAGE_HEIGHT))
style_image.save(style_image_path)
#style_image

def obter_modelo():

  modelo = VGG19(include_top = False, weights = 'imagenet', input_tensor = None)

  c_layer = CONTENT_LAYERS
  s_layers = STYLE_LAYERS

  output_layers = [modelo.get_layer(layer).output for layer in (c_layer + s_layers)]

  return Model(modelo.inputs, output_layers)

def processar_imagem(img):

  imagem = img.resize((IMAGE_HEIGHT, IMAGE_WIDTH))
  imagem = img_to_array(imagem)
  imagem = preprocess_input(imagem)
  imagem = expand_dims(imagem, axis=0)

  return imagem

def desprocessar_imagem(img):
  imagem = img
  mean = MEAN
  imagem[..., 0] += mean[0]
  imagem[..., 1] += mean[1]
  imagem[..., 2] += mean[2]
  imagem = imagem[..., ::-1]

  return imagem.astype(int)

def content_loss(c_mat, out_mat):
  return 0.5 * K.sum(K.square(out_mat - c_mat))


def matriz_gram(mat):
  return K.dot(mat,K.transpose(mat))


def style_loss(s_mat, out_mat):

  style_feat = K.batch_flatten(K.permute_dimensions(s_mat,(2,0,1)))
  output_feat = K.batch_flatten(K.permute_dimensions(out_mat,(2,0,1)))

  style_gram = matriz_gram(style_feat)
  output_gram = matriz_gram(output_feat)

  return K.sum(K.square(style_gram - output_gram)) / (4.0 * (CHANNELS ** 2) * (IMAGE_SIZE ** 2))


def total_loss(c_layer, s_layers, out_layers):

  content_layer = c_layer[0]
  out_content = out_layers[0]

  style_layers = s_layers[1:]
  out_style = out_layers[1:]

  c_loss = content_loss(content_layer[0], out_content[0])

  s_loss = None

  for i in range(len(style_layers)):
    if s_loss is None:
      s_loss = style_loss(style_layers[i][0], out_style[i][0])

    else:
      s_loss += style_loss(style_layers[i][0], out_style[i][0])

  return CONTENT_WEIGHT * c_loss + (STYLE_WEIGHT * s_loss)/len(style_layers)

modelo = obter_modelo()

#content image
content_processado = processar_imagem(input_image)
content_feats = modelo(K.variable(content_processado))

#style image
style_processado = processar_imagem(style_image)
style_feats = modelo(K.variable(style_processado))

#output image
output_processado = preprocess_input(np.random.uniform(0,250,(IMAGE_HEIGHT, IMAGE_WIDTH,CHANNELS)))
output_processado = expand_dims(output_processado, axis=0)
output_processado = K.variable(output_processado)

optimizer = tf.optimizers.Adam(5,beta_1=.99,epsilon=1e-3)
epochs=200

melhor_loss = K.variable(2000000.0)
melhor_imagem = None

min_value = MEAN
max_value = 255 + MEAN
loss = K.variable(0.0)

for e in range(epochs):
  with tf.GradientTape() as tape:
    tape.watch(output_processado)
    output_feats = modelo(output_processado)

    loss = total_loss(content_feats, style_feats, output_feats)

    grad = tape.gradient(loss, output_processado)
    optimizer.apply_gradients(zip([grad],[output_processado]))

    clip = tf.clip_by_value(output_processado, min_value, max_value)
    output_processado.assign(clip)
    print("Epoch: " + str(e) )

For tape.gradient, you have to pass (loss, model.trainable_weights), but you are passing tape.gradient(loss, output_processado). Likewise, for optimizer.apply_gradients you have to pass (grad, model.trainable_variables), but you are passing zip([grad], [output_processado]).

Calling a model inside a GradientTape scope enables you to retrieve the gradients of the trainable weights of its layers with respect to a loss value. Using an optimizer instance, you can then use these gradients to update those variables (which you can retrieve with model.trainable_weights).
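
For context, model.trainable_weights is simply the list of tf.Variable objects that the optimizer is expected to update. A minimal sketch to show this (the small two-layer model here is made up purely for illustration):

from tensorflow import keras
from tensorflow.keras import layers

# A throwaway model, only to show what trainable_weights contains.
model = keras.Sequential([
    layers.Dense(4, activation='relu', input_shape=(3,)),
    layers.Dense(1)
])

# Each entry is a tf.Variable (the kernels and biases of the Dense layers).
print([w.name for w in model.trainable_weights])
# e.g. ['dense/kernel:0', 'dense/bias:0', 'dense_1/kernel:0', 'dense_1/bias:0']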

TensorFlow provides the tf.GradientTape API for automatic differentiation, i.e. computing the gradient of a computation with respect to its input variables. TensorFlow "records" all operations executed inside the context of a tf.GradientTape onto a "tape". TensorFlow then uses that tape and the gradients associated with each recorded operation to compute the gradients of the "recorded" computation using reverse-mode differentiation.
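
As a small standalone illustration (separate from the question's code), here is the gradient of y = x² with respect to x:

import tensorflow as tf

x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x ** 2                    # this operation is recorded on the tape
dy_dx = tape.gradient(y, x)       # reverse-mode differentiation: dy/dx = 2*x
print(dy_dx.numpy())              # 6.0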

If you want to process the gradients before applying them, you can use the optimizer in three steps, as sketched after this list:

  1. Compute the gradients with tf.GradientTape.
  2. Process the gradients as you wish.
  3. Apply the processed gradients with apply_gradients().
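
A minimal sketch of these three steps, assuming model, loss_fn, optimizer and a batch (x_batch, y_batch) are already defined as in the MNIST example below; clipping each gradient by norm is used only as one possible way of "processing", and the clip norm of 1.0 is arbitrary:

with tf.GradientTape() as tape:
    logits = model(x_batch, training=True)
    loss_value = loss_fn(y_batch, logits)

# 1. Compute the gradients with tf.GradientTape.
grads = tape.gradient(loss_value, model.trainable_weights)

# 2. Process the gradients as you wish (here: clip each gradient by norm).
grads = [tf.clip_by_norm(g, 1.0) for g in grads]

# 3. Apply the processed gradients with apply_gradients().
optimizer.apply_gradients(zip(grads, model.trainable_weights))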

Here is a simple example with MNIST data. Comments are included in the code to explain it better.

Code -

import tensorflow as tf
print(tf.__version__)
from tensorflow import keras
from tensorflow.keras import layers

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Preprocess the data (these are Numpy arrays)
x_train = x_train.reshape(60000, 784).astype('float32') / 255
x_test = x_test.reshape(10000, 784).astype('float32') / 255

y_train = y_train.astype('float32')
y_test = y_test.astype('float32')

# Reserve 10,000 samples for validation
x_val = x_train[-10000:]
y_val = y_train[-10000:]
x_train = x_train[:-10000]
y_train = y_train[:-10000]

# Get the model.
inputs = keras.Input(shape=(784,), name='digits')
x = layers.Dense(64, activation='relu', name='dense_1')(inputs)
x = layers.Dense(64, activation='relu', name='dense_2')(x)
outputs = layers.Dense(10, name='predictions')(x)
model = keras.Model(inputs=inputs, outputs=outputs)

# Instantiate an optimizer.
optimizer = keras.optimizers.SGD(learning_rate=1e-3)
# Instantiate a loss function.
loss_fn = keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# Prepare the training dataset.
batch_size = 64
train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
train_dataset = train_dataset.shuffle(buffer_size=1024).batch(batch_size)

epochs = 3
for epoch in range(epochs):
  print('Start of epoch %d' % (epoch,))

  # Iterate over the batches of the dataset.
  for step, (x_batch_train, y_batch_train) in enumerate(train_dataset):

    # Open a GradientTape to record the operations run
    # during the forward pass, which enables autodifferentiation.
    with tf.GradientTape() as tape:

      # Run the forward pass of the layer.
      # The operations that the layer applies
      # to its inputs are going to be recorded
      # on the GradientTape.
      logits = model(x_batch_train, training=True)  # Logits for this minibatch

      # Compute the loss value for this minibatch.
      loss_value = loss_fn(y_batch_train, logits)

    # Use the gradient tape to automatically retrieve
    # the gradients of the trainable variables with respect to the loss.
    grads = tape.gradient(loss_value, model.trainable_weights)

    # Run one step of gradient descent by updating
    # the value of the variables to minimize the loss.
    optimizer.apply_gradients(zip(grads, model.trainable_weights))

    # Log every 200 batches.
    if step % 200 == 0:
        print('Training loss (for one batch) at step %s: %s' % (step, float(loss_value)))
        print('Seen so far: %s samples' % ((step + 1) * 64))

Output -

2.2.0
Start of epoch 0
Training loss (for one batch) at step 0: 2.323657512664795
Seen so far: 64 samples
Training loss (for one batch) at step 200: 2.3156163692474365
Seen so far: 12864 samples
Training loss (for one batch) at step 400: 2.2302279472351074
Seen so far: 25664 samples
Training loss (for one batch) at step 600: 2.131979465484619
Seen so far: 38464 samples
Start of epoch 1
Training loss (for one batch) at step 0: 2.00234317779541
Seen so far: 64 samples
Training loss (for one batch) at step 200: 1.7992427349090576
Seen so far: 12864 samples
Training loss (for one batch) at step 400: 1.8583933115005493
Seen so far: 25664 samples
Training loss (for one batch) at step 600: 1.6005337238311768
Seen so far: 38464 samples
Start of epoch 2
Training loss (for one batch) at step 0: 1.6701987981796265
Seen so far: 64 samples
Training loss (for one batch) at step 200: 1.6237502098083496
Seen so far: 12864 samples
Training loss (for one batch) at step 400: 1.3603084087371826
Seen so far: 25664 samples
Training loss (for one batch) at step 600: 1.246948480606079
Seen so far: 38464 samples

You can find more about tf.GradientTape here. The example used here is taken from here.

Hope this answers your question. Happy Learning.