Keras bug when adding preprocessing layer to sequential model

Question

I created a Sequential model made of a preprocessing layer, as follows:

import tensorflow.keras as keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dropout, RandomRotation
from tensorflow.keras.utils import set_random_seed; set_random_seed(72)
import matplotlib.pyplot as plt

(ax, ay), (qx, qy) = cifar10.load_data()
ay = keras.utils.to_categorical(ay, 10)
qy = keras.utils.to_categorical(qy, 10)
ax = ax.astype('float32'); ax /= 255;
qx = qx.astype('float32'); qx /= 255;

DA = Sequential([RandomRotation(180/360,fill_mode="nearest",interpolation="bilinear", input_shape=(32, 32, 3))])

Then I displayed the output for the first image using:

X=ax[0:1,:,:,:]
plt.imshow(X[0])
plt.show()

transformedX=DA(X).numpy()
plt.imshow(transformedX[0,:,:,:])
plt.show()

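Result:

This is the expected output. The network applied a random rotation to the image.

Then I added the preprocessing model to another Sequential model that contains only it and a Dropout layer.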

model = Sequential()
model.add(DA)
model.add(Dropout(0.25))

Finally, without using the new model at all, I displayed the image again in the same way as before:

X=ax[0:1,:,:,:]
plt.imshow(X[0])
plt.show()

transformedX=DA(X).numpy()
plt.imshow(transformedX[0,:,:,:])
plt.show()

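Result:

I get this result both locally (in Spyder) and on Google Colab. Here is the notebook, if you want to try it.

From then on, on every subsequent run of the program, every image looks like the original. To get this result again, I need to restart the runtime (in Google Colab); %reset does not seem to work locally.

If I remove input_shape=(32, 32, 3) from the preprocessing layer, the problem does not occur. However, I was under the impression that it is necessary to include it in the first layer of a model.

Is this an actual bug or a problem in my code?

If it is a bug, is it a problem with an outdated version of Keras or TensorFlow?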

Answer 1

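I suggest you take a look at this post. It is better to use DA as a layer or as a preprocessing step and take its output Y rather than the layer itself. For example, DA(X) is a one-off run, and you then feed its output (rather than DA(X) itself) to your model.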

For example:

model = DA
model.add(Dropout(0.25))

or roughly:

y = DA(X)  
z = model(y)

Answer 2

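There are three reasons for this. They have to do with:

1. how TF handles the training argument (or the lack of one) passed to a layer call
2. how the Dropout layer handles training=None
3. how TF builds a Sequential model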

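Note that my answer is based on TF v2.9.1.

`training` argument

Some layers, such as Dropout or RandomRotation, behave differently during training and inference. That is why, at their core, layers always try to determine whether they are being called during training whenever they are invoked via () (syntactic sugar for __call__). Internally, the training flag is set to the first of the following that applies, in order of priority: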

1. The training argument with a non-None value explicitly passed to the layer call, e.g. when you call the layer as layer(inputs, training=True/False).

2. The training argument determined by this very same 4-check procedure for its parent layer in a layer call chain.

3. The learning_phase variable of the backend, if that variable has been set. Checking the variable's state is done by keras.backend.global_learning_phase_is_set() and getting its value is done by keras.backend.learning_phase().

4. The default value of the training argument in this layer's call signature. Note that call ≠ __call__. The former is a TF-defined method and the latter is one of many built-in magic methods in Python, although the base layer's __call__ implementation eventually invokes call at some point.

If none of the 4 checks yields a non-None value, training=None is used.
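The priority order above can be sketched in plain Python. This is only an illustrative model, not TensorFlow's actual code: the function resolve_training, its parameter names, and the _UNSET sentinel are all hypothetical names introduced here.

```python
from typing import Optional

# Hypothetical sentinel meaning "the global learning phase was never set".
_UNSET = object()


def resolve_training(
    explicit: Optional[bool] = None,
    parent: Optional[bool] = None,
    learning_phase: object = _UNSET,
    call_default: Optional[bool] = None,
) -> Optional[bool]:
    """Return the training flag a layer would see under the four checks."""
    # Check 1: a non-None value passed explicitly to the layer call.
    if explicit is not None:
        return explicit
    # Check 2: the value already resolved for the parent layer, if any.
    if parent is not None:
        return parent
    # Check 3: the global learning phase, but only if it has been set.
    if learning_phase is not _UNSET:
        return bool(learning_phase)
    # Check 4: the default in the layer's call signature (may be None).
    return call_default


# A layer whose call signature defaults to training=True, phase never set.
print(resolve_training(call_default=True))  # True
# Same layer once the global phase has been set to 0.
print(resolve_training(call_default=True, learning_phase=0))  # False
# An explicit argument always wins.
print(resolve_training(explicit=True, learning_phase=0, call_default=True))  # True
```

The second call shows why a default of True stops helping once the global phase is set: check 3 is reached before check 4.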

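The RandomRotation layer only rotates images when it sees training=True. Your call to it fails the first three checks, but because training defaults to True in its call signature, it passes the last one. The layer therefore sees training=True and behaves as expected. However, as soon as you add Dropout, everything goes wrong. What is going on?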

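Dropout and `training=None`

It turns out that calling Dropout with an eventual training=None can actually set the state (but not the value) of the learning_phase variable. This happens easily because, unlike RandomRotation, Dropout has the default training=None, which offers no protection at check 4.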

>>> keras.backend.global_learning_phase_is_set()
False
>>> _ = tf.keras.layers.Dropout(.25)([1,2,3])
>>> keras.backend.global_learning_phase_is_set()
True

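Once this happens, all subsequent calls to any layer essentially ignore check 4: they will always see that learning_phase has been set and will use training=learning_phase (which defaults to 0) whenever check 3 is reached. Your later call to RandomRotation falls victim to this, which makes the layer think it is being called during inference, so it returns the input as-is.

More precisely, Dropout won't accept None for training and will fetch learning_phase directly regardless of its state, by calling learning_phase() without first checking global_learning_phase_is_set(). This unchecked learning_phase() call sets the state of learning_phase in the process.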

>>> keras.backend.global_learning_phase_is_set()
False
>>> keras.backend.learning_phase()
0
>>> keras.backend.global_learning_phase_is_set()
True
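The side effect described here can be modelled without TensorFlow. The sketch below is a toy, assuming only the behaviour stated in this answer: reading the phase marks it as set, Dropout reads it unconditionally when it ends up with training=None, and RandomRotation defaults to training=True. ToyBackend, layer_sees, toy_dropout and toy_rotation are hypothetical names, not Keras APIs.

```python
class ToyBackend:
    """Hypothetical stand-in for the backend's learning-phase bookkeeping."""

    def __init__(self):
        self._phase = None  # None means "never set"

    def global_learning_phase_is_set(self):
        return self._phase is not None

    def learning_phase(self):
        # Reading the phase lazily initialises it to 0 (inference),
        # which also flips the "is set" state as a side effect.
        if self._phase is None:
            self._phase = 0
        return self._phase


def layer_sees(backend, training=None, call_default=None):
    """Training flag seen by a layer that has no parent layer."""
    if training is not None:  # check 1
        return training
    if backend.global_learning_phase_is_set():  # check 3
        return bool(backend.learning_phase())
    return call_default  # check 4


def toy_rotation(backend, training=None):
    # Call signature defaults to training=True in this toy model.
    return layer_sees(backend, training, call_default=True)


def toy_dropout(backend, training=None):
    flag = layer_sees(backend, training, call_default=None)
    if flag is None:
        # Unchecked read: this is what marks the phase as set.
        flag = bool(backend.learning_phase())
    return flag


backend = ToyBackend()
print(toy_rotation(backend))  # True: falls through to the default
print(toy_dropout(backend))  # False: and the phase is now "set"
print(toy_rotation(backend))  # False: check 3 now wins over the default
print(toy_rotation(backend, training=True))  # True: explicit argument wins
```

The third print mirrors the symptom in the question: the same rotation call that worked before now behaves as if it were in inference mode.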

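But I didn't call Dropout?

This is the last part, which is about how Sequential adds layers to its stack. When you add a first layer that is not a keras tensor but has a known input shape, Sequential creates an input keras tensor of exactly that shape and immediately calls the layer on that tensor to obtain an output keras tensor. This is possible because the input shape is known.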

>>> Sequential([RandomRotation(0.5)]).outputs is None
True
>>> Sequential([RandomRotation(0.5, input_shape=(2,2,1))]).outputs
[<KerasTensor: shape=(None, 2, 2, 1) dtype=float32 (created by layer 'random_rotation_7')>]

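From there, every time another layer is added, the Sequential model checks whether it already has output keras tensors (i.e. whether the input shape is known). If so, it again immediately calls the new layer on the current output tensors to obtain updated ones. Otherwise, the input shape is unknown, and the model defers the construction of its output keras tensors until it is later called on actual input data.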

>>> from tensorflow.keras.models import Sequential
>>> from tensorflow.keras.layers import RandomRotation, Dropout
>>> class DropoutWithCount(Dropout):
...     def __init__(self, rate, noise_shape=None, seed=None, **kwargs):
...         super().__init__(rate, noise_shape, seed, **kwargs)
...         self.count = 0
...
...     def call(self, inputs, training=None):
...         self.count += 1
...         print(f"Dropout called with training={training}, call counts = {self.count}")
...         return super().call(inputs, training)
...
>>> m = Sequential([RandomRotation(0.5, input_shape=(2,2,1)), DropoutWithCount(.25)])
Dropout called with training=None, call counts = 1
>>> m = Sequential([RandomRotation(0.5, input_shape=(2,2,1))])
>>> m1 = Sequential()
>>> m1.add(m)
>>> m1.add(DropoutWithCount(.25))
Dropout called with training=None, call counts = 1
>>> m = Sequential([RandomRotation(0.5), DropoutWithCount(.25)])
>>>

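So yes, because the input shape is known, the Dropout layer is automatically called without any training argument as soon as it is added to the Sequential model, which in turn sets the state of learning_phase.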
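The add-time behaviour can also be mimicked with a toy class. This is a simplified sketch under the assumptions described in this answer, not the real Sequential implementation: ToyLayer and ToySequential are hypothetical, the placeholder tuple stands in for a keras tensor, and nested models are not handled.

```python
class ToyLayer:
    """Hypothetical shape-preserving layer that counts its calls."""

    def __init__(self, name, input_shape=None):
        self.name = name
        self.input_shape = input_shape
        self.calls = 0

    def __call__(self, x):
        self.calls += 1
        return x  # output has the same shape as the input


class ToySequential:
    """Hypothetical model that calls layers eagerly once a shape is known."""

    def __init__(self, layers=()):
        self.layers = []
        self.outputs = None  # stays None until an input shape is known
        for layer in layers:
            self.add(layer)

    def add(self, layer):
        if not self.layers and layer.input_shape is not None:
            # First layer with a known shape: build a placeholder input
            # and call the layer on it straight away.
            self.outputs = layer(("placeholder", layer.input_shape))
        elif self.outputs is not None:
            # Outputs already exist, so the new layer is called immediately.
            self.outputs = layer(self.outputs)
        # Otherwise building is deferred until the model sees real data.
        self.layers.append(layer)


drop_a = ToyLayer("dropout")
ToySequential([ToyLayer("rotation", input_shape=(2, 2, 1)), drop_a])
print(drop_a.calls)  # 1: called as a side effect of being added

drop_b = ToyLayer("dropout")
ToySequential([ToyLayer("rotation"), drop_b])
print(drop_b.calls)  # 0: no input shape, so nothing is called yet
```

This matches the observation in the question that removing input_shape from the first layer makes the problem disappear.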

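What should I do?

Always pass the training argument explicitly to your model/layer calls, since an explicitly passed argument has the highest priority. Alternatively, do not pass training to any call, and instead set the global learning_phase value to True or False via keras.backend.set_learning_phase(True/False), since this takes priority over a layer's default training value.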

>>> from tensorflow.keras.models import Sequential
>>> from tensorflow.keras.layers import RandomRotation, Dropout
>>> import keras as keras
>>> import numpy as np
>>> img = np.array([[[[1],[2]],[[3],[4]]]])
>>> m = Sequential([RandomRotation(0.5, input_shape=(2,2,1))])
>>> m(img)
<tf.Tensor: shape=(1, 2, 2, 1), dtype=float32, numpy=
array([[[[1.6862597],
         [3.3725195]],

        [[1.6274806],
         [3.3137403]]]], dtype=float32)>
>>> m1 = Sequential()
>>> m1.add(m)
>>> m1.add(Dropout(.25))
>>> m(img)
<tf.Tensor: shape=(1, 2, 2, 1), dtype=float32, numpy=
array([[[[1.],
         [2.]],

        [[3.],
         [4.]]]], dtype=float32)>
>>> m(img, training=True)
<tf.Tensor: shape=(1, 2, 2, 1), dtype=float32, numpy=
array([[[[1.8427435],
         [3.685487 ]],

        [[1.314513 ],
         [3.1572566]]]], dtype=float32)>
>>> keras.backend.set_learning_phase(True)
>>> m(img)
<tf.Tensor: shape=(1, 2, 2, 1), dtype=float32, numpy=
array([[[[3.3871531],
         [3.3064234]],

        [[1.6935766],
         [1.612847 ]]]], dtype=float32)>
