Gradients are None while implementing GradCam

Question

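I am trying to implement Grad-CAM for a TensorFlow 2.0 model (built with the Keras API), but the gradient returned by the tape is always None.

I am following the example given at https://keras.io/examples/vision/grad_cam/.

My model is fairly simple, but for debugging I swapped it for the built-in Xception model from tf.keras.applications (the behavior is the same, so the problem must be in my code).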

    # model (not shown here) is Xception from tf.keras.applications
    cam_model = tf.keras.models.Model(
        [model.inputs],
        [model.get_layer(conv_layer_name).output, model.output] # conv_layer_name = 'block14_sepconv2_act'
    )

    with tf.GradientTape() as tape:
        conv_out, predictions = cam_model(image)
        class_out = tf.argmax(predictions, axis=-1)

    grads = tape.gradient(class_out, conv_out)

    if grads is None: # grads is None
        raise Exception("Grad cam has recorded no gradient")

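This is pretty simple, and I can't see why the gradient is None. I suspected that the tape might not be recording, but the colab at https://colab.research.google.com/github/keras-team/keras-io/blob/master/examples/vision/ipynb/grad_cam.ipynb does not seem to need anything extra.

There is a related question, but there the conv layer was incorrect, whereas here it is indeed the correct layer.

Edit

So the argmax is the problem in the Xception case, but fixing it (e.g. by using the predictions directly) does not work for my model. Here is the model definition code: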

    backbone = VGG16(
        include_top=False,
        weights='imagenet',
        input_shape=(*size, 3),
        pooling='max'
    )

    backbone.trainable = False

    net = Sequential()

    for layer in backbone.layers:
        net.add(layer)

    net.add(Flatten())
    net.add(Dense(256, activation='relu'))
    net.add(Dense(128, activation='relu'))
    net.add(Dense(len(CLASSES), activation='softmax'))

This is with TensorFlow 2.8.0 on a GPU.

Answer 1

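As @AloneTogether mentioned, the result of argmax is not differentiable, so getting None from tape.gradient(...) is expected: argmax returns an integer index, and there is no gradient path from that index back to the convolutional output.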
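The behavior is easy to reproduce outside of any real model. The following is a minimal sketch, assuming a toy softmax "classifier" built from constant tensors (`x`, `w` and the variable names are made up for illustration); it compares differentiating the argmax index with differentiating a float score.

```python
import tensorflow as tf

# Toy stand-ins (hypothetical): x plays the role of the conv output,
# w the role of everything between that output and the predictions.
x = tf.constant([[1.0, 2.0, 3.0]])
w = tf.constant([[0.5, -1.0], [1.5, 0.25], [-0.5, 2.0]])

# persistent=True only because gradient() is called twice below.
with tf.GradientTape(persistent=True) as tape:
    tape.watch(x)
    predictions = tf.nn.softmax(tf.matmul(x, w))
    class_index = tf.argmax(predictions, axis=-1)   # int64 index, not differentiable
    top_score = tf.reduce_max(predictions, axis=-1) # float activation, differentiable

grad_from_argmax = tape.gradient(class_index, x)
grad_from_score = tape.gradient(top_score, x)
del tape

print(grad_from_argmax)       # None
print(grad_from_score.shape)  # (1, 3)
```

The integer target carries no gradient information, so the tape has nothing to return for it, while the float score of the same class does.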

Although the result of argmax cannot be differentiated, it can be used to select the correct activation:

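        # Still inside the `with tf.GradientTape()` block.
        # argmax over the single image's predictions gives a scalar index,
        # which TensorFlow accepts inside a slice (a 1-D index tensor is rejected).
        class_pred = tf.argmax(predictions[0])
        class_out = predictions[:, class_pred]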

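This solved the problem (when using Xception).

The other problem with my full model was a disconnected-graph error when trying to access the inner layers of VGG16. I was able to work around it, in a not very satisfying way, by using the VGG16 input as the input of the model and building the following layers directly on the VGG16 output (using the functional API):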
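Put together, the whole computation might look like the sketch below. It assumes a tiny randomly initialized CNN in place of Xception so that it runs without downloading weights; the layer name `last_conv`, the input size and the heatmap normalization are illustrative choices modeled on the Keras example, not part of my actual code.

```python
import tensorflow as tf

# Hypothetical tiny CNN standing in for Xception (random weights).
inputs = tf.keras.Input(shape=(32, 32, 3))
x = tf.keras.layers.Conv2D(8, 3, activation="relu", name="last_conv")(inputs)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(5, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)

# Model that returns both the conv feature map and the predictions.
cam_model = tf.keras.Model(
    model.inputs, [model.get_layer("last_conv").output, model.output]
)

image = tf.random.uniform((1, 32, 32, 3))

with tf.GradientTape() as tape:
    conv_out, predictions = cam_model(image)
    class_index = tf.argmax(predictions[0])  # scalar index of the top class
    class_out = predictions[:, class_index]  # float score, differentiable

# Gradient of the class score with respect to the conv feature map.
grads = tape.gradient(class_out, conv_out)

# Channel weights: mean gradient over batch and spatial dimensions.
channel_weights = tf.reduce_mean(grads, axis=(0, 1, 2))

# Weighted sum of the feature-map channels, then ReLU and scaling to [0, 1].
heatmap = tf.reduce_sum(conv_out[0] * channel_weights, axis=-1)
heatmap = tf.nn.relu(heatmap)
heatmap = heatmap / (tf.reduce_max(heatmap) + 1e-8)
```

With a real network, `last_conv` would be replaced by the name of the last convolutional activation (for Xception, `block14_sepconv2_act` as in the question).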

    x = vgg16.output
    x = Flatten()(x)
    ...
    return Model(vgg16.input, x)

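The network graph is then fully unrolled, meaning that there is no single "vgg" block; all the layers of VGG appear individually instead. I think a non-unrolled version is possible, but I was not able to achieve it. A related answer suggests that it can be done.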
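For reference, one pattern that is often suggested for nested models is to build the Grad-CAM sub-model inside the backbone's own graph and then apply the head by hand. The sketch below is untested against my real VGG16 setup and rests on assumptions: `backbone`, `head` and `inner_conv` are hypothetical names, and a small random-weight network stands in for VGG16.

```python
import tensorflow as tf

# Hypothetical small backbone standing in for VGG16 (random weights).
backbone_in = tf.keras.Input(shape=(32, 32, 3))
h = tf.keras.layers.Conv2D(8, 3, activation="relu", name="inner_conv")(backbone_in)
h = tf.keras.layers.GlobalMaxPooling2D()(h)
backbone = tf.keras.Model(backbone_in, h, name="backbone")

# Classification head kept as its own model so it can be re-applied manually.
head = tf.keras.Sequential(
    [
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(4, activation="softmax"),
    ],
    name="head",
)

# Full model in which the backbone stays a single nested block.
net_in = tf.keras.Input(shape=(32, 32, 3))
net = tf.keras.Model(net_in, head(backbone(net_in)))

# Sub-model defined entirely on the backbone's own input and layers,
# so the inner layer is connected to the input it is built from.
backbone_cam = tf.keras.Model(
    backbone.inputs, [backbone.get_layer("inner_conv").output, backbone.output]
)

image = tf.random.uniform((1, 32, 32, 3))

with tf.GradientTape() as tape:
    conv_out, features = backbone_cam(image)
    predictions = head(features)             # same weights as inside `net`
    class_index = tf.argmax(predictions[0])
    class_out = predictions[:, class_index]

grads = tape.gradient(class_out, conv_out)
```

Because `backbone_cam` and `head` share their layers with `net`, the predictions computed here match those of the full model, while the backbone remains one block in the model summary.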


python

keras

tensorflow

tensorflow2.0