No gradients are provided / 'NoneType' object is not callable when trying to fit a multi-output model

I'm new to machine learning and have been struggling a lot with this problem. I'm using a Kaggle notebook with TensorFlow version 2.3.1. I need to train a model on facial images to predict multiple attributes, such as wrinkles, freckles, hair color, hair thickness, and glasses, hence the multi-output model. When I try model.fit, I first get the error "No gradients provided". On re-running the exact same code with no changes, I instead get the error "'NoneType' object is not callable". I've been stuck on this for over a week, and so far none of the solutions I've found online have resolved it, so I'm including as much detail as possible here. Some supporting information about the problem: wrinkles and freckles take values of 0 or 1, while the other outputs range from 0 to 3, 0 to 5, or 0 to 9. Here is the code.

Setting up the CNN:

# Imports used by the snippets below
from tensorflow.keras.layers import (Input, Conv2D, BatchNormalization, MaxPooling2D,
                                     Dropout, Flatten, Dense)
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

IMAGE_SIZE = 100

def base_hidden_layers(input_shape):
    model = Conv2D(64, kernel_size = (3, 3), padding = 'same', activation= 'relu')(input_shape)
    model = BatchNormalization(axis=-1)(model)
    model = MaxPooling2D(pool_size=(2, 2), strides=(2, 2))(model)
    model = Dropout(0.25)(model)
    
    model = Conv2D(128, kernel_size = (3, 3), padding = 'same', activation= 'relu')(model)
    model = BatchNormalization(axis=-1)(model)
    model = MaxPooling2D(pool_size=(2, 2), strides=(2, 2))(model)
    model = Dropout(0.25)(model)
    
    model = Conv2D(128, kernel_size = (3, 3), padding = 'same', activation= 'relu')(model)
    model = BatchNormalization(axis=-1)(model)
    model = MaxPooling2D(pool_size=(2, 2), strides=(2, 2))(model)
    model = Dropout(0.25)(model)
    
    return model

def wrinkle_layers(input_shape):
    model = base_hidden_layers(input_shape)
    
    model = Flatten()(model)
    model = Dense(1024, activation = 'relu')(model)
    model = BatchNormalization()(model)
    model = Dropout(0.5)(model)
    model = Dense(2, activation = 'sigmoid', name = 'wrinkles')(model)
    
    return model
    
def freckles_layers(input_shape):
    model = base_hidden_layers(input_shape)
    
    model = Flatten()(model)
    model = Dense(1024, activation = 'relu')(model)
    model = BatchNormalization()(model)
    model = Dropout(0.5)(model)
    model = Dense(2, activation = 'sigmoid', name = 'freckles')(model)
    
    return model

def glasses_layers(input_shape):
    model = base_hidden_layers(input_shape)
    
    model = Flatten()(model)
    model = Dense(1024, activation = 'relu')(model)
    model = BatchNormalization()(model)
    model = Dropout(0.5)(model)
    model = Dense(3, activation = 'softmax', name = 'glasses')(model)
    
    return model

def hair_color_layers(input_shape):
    model = base_hidden_layers(input_shape)
    
    model = Flatten()(model)
    model = Dense(1024, activation = 'relu')(model)
    model = BatchNormalization()(model)
    model = Dropout(0.5)(model)
    model = Dense(9, activation = 'softmax', name = 'hair_color')(model)
    
    return model

def hair_top_layers(input_shape):
    model = base_hidden_layers(input_shape)
    
    model = Flatten()(model)
    model = Dense(1024, activation = 'relu')(model)
    model = BatchNormalization()(model)
    model = Dropout(0.5)(model)
    model = Dense(4, activation = 'softmax', name = 'hair_top')(model)
    
    return model

def neural_net_model(image_size):
    shape = (image_size, image_size, 3)
    shape = Input(shape=shape)
    
    wrinkle_branch = wrinkle_layers(shape)
    freckles_branch = freckles_layers(shape)
    glasses_branch = glasses_layers(shape)
    hair_col_branch = hair_color_layers(shape)
    hair_top_branch = hair_top_layers(shape)
    
    model = Model(inputs=shape, outputs= [wrinkle_branch, freckles_branch, glasses_branch, hair_col_branch, hair_top_branch], name='face_net')
    
    return model

model = neural_net_model(IMAGE_SIZE)
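
A quick check that is not in the original code, but can help here: printing the output layer names after building the model, since these are the keys that the loss dictionary below (and any per-output label dictionary) must use:

print(model.output_names)
# expected: ['wrinkles', 'freckles', 'glasses', 'hair_color', 'hair_top']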

Compiling the model and setting the loss functions:

model.compile(optimizer=Adam(lr=1e-4, decay= 1e-4 / 100),
             loss = {'wrinkles': 'binary_crossentropy',
                    'freckles': 'binary_crossentropy',
                    'glasses': 'categorical_crossentropy',
                    'hair_color': 'categorical_crossentropy',
                    'hair_top': 'categorical_crossentropy'}, metrics=['accuracy'])

model.summary()

Output of model.summary():

Model: "face_net"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            [(None, 100, 100, 3) 0                                            
__________________________________________________________________________________________________
conv2d (Conv2D)                 (None, 100, 100, 64) 1792        input_1[0][0]                    
__________________________________________________________________________________________________
conv2d_3 (Conv2D)               (None, 100, 100, 64) 1792        input_1[0][0]                    
__________________________________________________________________________________________________
conv2d_6 (Conv2D)               (None, 100, 100, 64) 1792        input_1[0][0]                    
__________________________________________________________________________________________________
conv2d_9 (Conv2D)               (None, 100, 100, 64) 1792        input_1[0][0]                    
__________________________________________________________________________________________________
conv2d_12 (Conv2D)              (None, 100, 100, 64) 1792        input_1[0][0]                    
__________________________________________________________________________________________________
batch_normalization (BatchNorma (None, 100, 100, 64) 256         conv2d[0][0]                     
__________________________________________________________________________________________________
batch_normalization_4 (BatchNor (None, 100, 100, 64) 256         conv2d_3[0][0]                   
__________________________________________________________________________________________________
batch_normalization_8 (BatchNor (None, 100, 100, 64) 256         conv2d_6[0][0]                   
__________________________________________________________________________________________________
batch_normalization_12 (BatchNo (None, 100, 100, 64) 256         conv2d_9[0][0]                   
__________________________________________________________________________________________________
batch_normalization_16 (BatchNo (None, 100, 100, 64) 256         conv2d_12[0][0]                  
__________________________________________________________________________________________________
max_pooling2d (MaxPooling2D)    (None, 50, 50, 64)   0           batch_normalization[0][0]        
__________________________________________________________________________________________________
max_pooling2d_3 (MaxPooling2D)  (None, 50, 50, 64)   0           batch_normalization_4[0][0]      
__________________________________________________________________________________________________
max_pooling2d_6 (MaxPooling2D)  (None, 50, 50, 64)   0           batch_normalization_8[0][0]      
__________________________________________________________________________________________________
max_pooling2d_9 (MaxPooling2D)  (None, 50, 50, 64)   0           batch_normalization_12[0][0]     
__________________________________________________________________________________________________
max_pooling2d_12 (MaxPooling2D) (None, 50, 50, 64)   0           batch_normalization_16[0][0]     
__________________________________________________________________________________________________
dropout (Dropout)               (None, 50, 50, 64)   0           max_pooling2d[0][0]              
__________________________________________________________________________________________________
dropout_4 (Dropout)             (None, 50, 50, 64)   0           max_pooling2d_3[0][0]            
__________________________________________________________________________________________________
dropout_8 (Dropout)             (None, 50, 50, 64)   0           max_pooling2d_6[0][0]            
__________________________________________________________________________________________________
dropout_12 (Dropout)            (None, 50, 50, 64)   0           max_pooling2d_9[0][0]            
__________________________________________________________________________________________________
dropout_16 (Dropout)            (None, 50, 50, 64)   0           max_pooling2d_12[0][0]           
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 50, 50, 128)  73856       dropout[0][0]                    
__________________________________________________________________________________________________
conv2d_4 (Conv2D)               (None, 50, 50, 128)  73856       dropout_4[0][0]                  
__________________________________________________________________________________________________
conv2d_7 (Conv2D)               (None, 50, 50, 128)  73856       dropout_8[0][0]                  
__________________________________________________________________________________________________
conv2d_10 (Conv2D)              (None, 50, 50, 128)  73856       dropout_12[0][0]                 
__________________________________________________________________________________________________
conv2d_13 (Conv2D)              (None, 50, 50, 128)  73856       dropout_16[0][0]                 
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 50, 50, 128)  512         conv2d_1[0][0]                   
__________________________________________________________________________________________________
batch_normalization_5 (BatchNor (None, 50, 50, 128)  512         conv2d_4[0][0]                   
__________________________________________________________________________________________________
batch_normalization_9 (BatchNor (None, 50, 50, 128)  512         conv2d_7[0][0]                   
__________________________________________________________________________________________________
batch_normalization_13 (BatchNo (None, 50, 50, 128)  512         conv2d_10[0][0]                  
__________________________________________________________________________________________________
batch_normalization_17 (BatchNo (None, 50, 50, 128)  512         conv2d_13[0][0]                  
__________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D)  (None, 25, 25, 128)  0           batch_normalization_1[0][0]      
__________________________________________________________________________________________________
max_pooling2d_4 (MaxPooling2D)  (None, 25, 25, 128)  0           batch_normalization_5[0][0]      
__________________________________________________________________________________________________
max_pooling2d_7 (MaxPooling2D)  (None, 25, 25, 128)  0           batch_normalization_9[0][0]      
__________________________________________________________________________________________________
max_pooling2d_10 (MaxPooling2D) (None, 25, 25, 128)  0           batch_normalization_13[0][0]     
__________________________________________________________________________________________________
max_pooling2d_13 (MaxPooling2D) (None, 25, 25, 128)  0           batch_normalization_17[0][0]     
__________________________________________________________________________________________________
dropout_1 (Dropout)             (None, 25, 25, 128)  0           max_pooling2d_1[0][0]            
__________________________________________________________________________________________________
dropout_5 (Dropout)             (None, 25, 25, 128)  0           max_pooling2d_4[0][0]            
__________________________________________________________________________________________________
dropout_9 (Dropout)             (None, 25, 25, 128)  0           max_pooling2d_7[0][0]            
__________________________________________________________________________________________________
dropout_13 (Dropout)            (None, 25, 25, 128)  0           max_pooling2d_10[0][0]           
__________________________________________________________________________________________________
dropout_17 (Dropout)            (None, 25, 25, 128)  0           max_pooling2d_13[0][0]           
__________________________________________________________________________________________________
conv2d_2 (Conv2D)               (None, 25, 25, 128)  147584      dropout_1[0][0]                  
__________________________________________________________________________________________________
conv2d_5 (Conv2D)               (None, 25, 25, 128)  147584      dropout_5[0][0]                  
__________________________________________________________________________________________________
conv2d_8 (Conv2D)               (None, 25, 25, 128)  147584      dropout_9[0][0]                  
__________________________________________________________________________________________________
conv2d_11 (Conv2D)              (None, 25, 25, 128)  147584      dropout_13[0][0]                 
__________________________________________________________________________________________________
conv2d_14 (Conv2D)              (None, 25, 25, 128)  147584      dropout_17[0][0]                 
__________________________________________________________________________________________________
batch_normalization_2 (BatchNor (None, 25, 25, 128)  512         conv2d_2[0][0]                   
__________________________________________________________________________________________________
batch_normalization_6 (BatchNor (None, 25, 25, 128)  512         conv2d_5[0][0]                   
__________________________________________________________________________________________________
batch_normalization_10 (BatchNo (None, 25, 25, 128)  512         conv2d_8[0][0]                   
__________________________________________________________________________________________________
batch_normalization_14 (BatchNo (None, 25, 25, 128)  512         conv2d_11[0][0]                  
__________________________________________________________________________________________________
batch_normalization_18 (BatchNo (None, 25, 25, 128)  512         conv2d_14[0][0]                  
__________________________________________________________________________________________________
max_pooling2d_2 (MaxPooling2D)  (None, 12, 12, 128)  0           batch_normalization_2[0][0]      
__________________________________________________________________________________________________
max_pooling2d_5 (MaxPooling2D)  (None, 12, 12, 128)  0           batch_normalization_6[0][0]      
__________________________________________________________________________________________________
max_pooling2d_8 (MaxPooling2D)  (None, 12, 12, 128)  0           batch_normalization_10[0][0]     
__________________________________________________________________________________________________
max_pooling2d_11 (MaxPooling2D) (None, 12, 12, 128)  0           batch_normalization_14[0][0]     
__________________________________________________________________________________________________
max_pooling2d_14 (MaxPooling2D) (None, 12, 12, 128)  0           batch_normalization_18[0][0]     
__________________________________________________________________________________________________
dropout_2 (Dropout)             (None, 12, 12, 128)  0           max_pooling2d_2[0][0]            
__________________________________________________________________________________________________
dropout_6 (Dropout)             (None, 12, 12, 128)  0           max_pooling2d_5[0][0]            
__________________________________________________________________________________________________
dropout_10 (Dropout)            (None, 12, 12, 128)  0           max_pooling2d_8[0][0]            
__________________________________________________________________________________________________
dropout_14 (Dropout)            (None, 12, 12, 128)  0           max_pooling2d_11[0][0]           
__________________________________________________________________________________________________
dropout_18 (Dropout)            (None, 12, 12, 128)  0           max_pooling2d_14[0][0]           
__________________________________________________________________________________________________
flatten (Flatten)               (None, 18432)        0           dropout_2[0][0]                  
__________________________________________________________________________________________________
flatten_1 (Flatten)             (None, 18432)        0           dropout_6[0][0]                  
__________________________________________________________________________________________________
flatten_2 (Flatten)             (None, 18432)        0           dropout_10[0][0]                 
__________________________________________________________________________________________________
flatten_3 (Flatten)             (None, 18432)        0           dropout_14[0][0]                 
__________________________________________________________________________________________________
flatten_4 (Flatten)             (None, 18432)        0           dropout_18[0][0]                 
__________________________________________________________________________________________________
dense (Dense)                   (None, 1024)         18875392    flatten[0][0]                    
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 1024)         18875392    flatten_1[0][0]                  
__________________________________________________________________________________________________
dense_2 (Dense)                 (None, 1024)         18875392    flatten_2[0][0]                  
__________________________________________________________________________________________________
dense_3 (Dense)                 (None, 1024)         18875392    flatten_3[0][0]                  
__________________________________________________________________________________________________
dense_4 (Dense)                 (None, 1024)         18875392    flatten_4[0][0]                  
__________________________________________________________________________________________________
batch_normalization_3 (BatchNor (None, 1024)         4096        dense[0][0]                      
__________________________________________________________________________________________________
batch_normalization_7 (BatchNor (None, 1024)         4096        dense_1[0][0]                    
__________________________________________________________________________________________________
batch_normalization_11 (BatchNo (None, 1024)         4096        dense_2[0][0]                    
__________________________________________________________________________________________________
batch_normalization_15 (BatchNo (None, 1024)         4096        dense_3[0][0]                    
__________________________________________________________________________________________________
batch_normalization_19 (BatchNo (None, 1024)         4096        dense_4[0][0]                    
__________________________________________________________________________________________________
dropout_3 (Dropout)             (None, 1024)         0           batch_normalization_3[0][0]      
__________________________________________________________________________________________________
dropout_7 (Dropout)             (None, 1024)         0           batch_normalization_7[0][0]      
__________________________________________________________________________________________________
dropout_11 (Dropout)            (None, 1024)         0           batch_normalization_11[0][0]     
__________________________________________________________________________________________________
dropout_15 (Dropout)            (None, 1024)         0           batch_normalization_15[0][0]     
__________________________________________________________________________________________________
dropout_19 (Dropout)            (None, 1024)         0           batch_normalization_19[0][0]     
__________________________________________________________________________________________________
wrinkles (Dense)                (None, 2)            2050        dropout_3[0][0]                  
__________________________________________________________________________________________________
freckles (Dense)                (None, 2)            2050        dropout_7[0][0]                  
__________________________________________________________________________________________________
glasses (Dense)                 (None, 3)            3075        dropout_11[0][0]                 
__________________________________________________________________________________________________
hair_color (Dense)              (None, 9)            9225        dropout_15[0][0]                 
__________________________________________________________________________________________________
hair_top (Dense)                (None, 4)            4100        dropout_19[0][0]                 
==================================================================================================
Total params: 95,540,500
Trainable params: 95,527,060
Non-trainable params: 13,440

Shapes of the training and validation sets:

print(valid_images.shape)
print(valid_labels.shape)
print(train_images.shape)
print(train_labels.shape)
print(type(valid_labels))

Output:

(319, 100, 100, 3)
(319, 5)
(1272, 100, 100, 3)
(1272, 5)
<class 'numpy.ndarray'>

The input shape above means: (number of images, image height, image width, image depth (RGB)).

The label shape means: (number of images, number of output columns).
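
Not part of the original post, but because each output has a different number of classes (0/1 for wrinkles and freckles, up to 0-9 for the others), a quick per-column check of the label array printed in the next section is an easy way to confirm them:

import numpy as np

# Illustrative: list the distinct values in each of the 5 label columns
for col in range(train_labels.shape[1]):
    print(col, np.unique(train_labels[:, col]))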

Printing the training inputs and labels:

train_images

Output:

array([[[[0.41176471, 0.49019608, 0.44705882],
         [0.40784314, 0.49411765, 0.44705882],
         [0.41176471, 0.50588235, 0.45490196],
         ...,
         [0.77647059, 0.81568627, 0.81960784],
         [0.7372549 , 0.76862745, 0.78823529],
         [0.57254902, 0.6       , 0.63921569]]]])

train_labels

Output:

array([[0, 0, 0, 2, 2],
       [0, 1, 0, 0, 2],
       [0, 0, 0, 0, 2],
       ...,
       [0, 0, 0, 1, 2],
       [0, 0, 0, 2, 2],
       [1, 0, 0, 2, 2]])

Fitting the model:

history = model.fit((train_images, train_labels),
                    epochs = 100, 
                    validation_data=(valid_images, 
                                     valid_labels))

plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label = 'val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.show()

test_loss, test_acc = model.evaluate(test_images,  test_labels)

Error stack trace from running model.fit for the first time after the notebook starts:

Epoch 1/100
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-34-0dd91e39ee8e> in <module>
      5                     epochs = 100,
      6                     validation_data=(valid_images, 
----> 7                                      valid_labels))
      8 
      9 plt.plot(history.history['accuracy'], label='accuracy')

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py in _method_wrapper(self, *args, **kwargs)
    106   def _method_wrapper(self, *args, **kwargs):
    107     if not self._in_multi_worker_mode():  # pylint: disable=protected-access
--> 108       return method(self, *args, **kwargs)
    109 
    110     # Running inside `run_distribute_coordinator` already.

/opt/conda/
...

ValueError: in user code:

    /opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py:806 train_function  *
        return ...
    /opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py:1271 _filter_grads
        ([v.name for _, v in grads_and_vars],))

    ValueError: No gradients provided for any variable: ['conv2d/kernel:0', 'conv2d/bias:0', 'conv2d_3/kernel:0', ... 'hair_top/kernel:0', 'hair_top/bias:0'].

Error from running the code a second time:

Epoch 1/100
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-38-356b6cd04e18> in <module>
      2                     epochs = 100,
      3                     validation_data=(valid_images, 
----> 4                                      valid_labels))
      5 
      6 plt.plot(history.history['accuracy'], label='accuracy')

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py in _method_wrapper(self, *args, **kwargs)
    106   def _method_wrapper(self, *args, **kwargs):
    107     if not self._in_multi_worker_mode():  # pylint: disable=protected-access
--> 108       return method(self, *args, **kwargs)
    109 
    110     # Running inside `run_distribute_coordinator` already.

    ...
TypeError: 'NoneType' object is not callable

I've only included the important parts of the stack traces, because the full traces would push this post over 3,000 characters.

It turns out the tutorial I was following was not accurate. Since my NN has 5 branches (because it makes 5 predictions), my model.fit call needed to change so that each branch is mapped to its corresponding labels.

Incorrect code:

history = model.fit((train_images, train_labels),
                epochs = 100, 
                validation_data=(valid_images, 
                                 valid_labels))

Correct code:

history = model.fit(x=train_images,
                    y={'wrinkles': train_wrinkles, 'freckles': train_freckles, 'glasses': train_glasses,
                       'hair_color': train_hair_col, 'hair_top': train_hair_top},
                    epochs=20,
                    validation_data=(valid_images,
                                     {'wrinkles': valid_wrinkles, 'freckles': valid_freckles, 'glasses': valid_glasses,
                                      'hair_color': valid_hair_col, 'hair_top': valid_hair_top}))

In the incorrect code, train_labels and valid_labels each contain a single array holding the labels for all 5 predictions. In the correct code, the labels for each prediction are passed separately.
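
The per-output arrays used above (train_wrinkles, valid_wrinkles, and so on) are not defined in the original snippets; here is a minimal sketch of how they could be sliced out of the combined (N, 5) label arrays, assuming the columns are ordered wrinkles, freckles, glasses, hair_color, hair_top:

# Assumed column order: wrinkles, freckles, glasses, hair_color, hair_top
train_wrinkles = train_labels[:, 0]
train_freckles = train_labels[:, 1]
train_glasses  = train_labels[:, 2]
train_hair_col = train_labels[:, 3]
train_hair_top = train_labels[:, 4]

# The validation labels are split the same way
valid_wrinkles = valid_labels[:, 0]
valid_freckles = valid_labels[:, 1]
valid_glasses  = valid_labels[:, 2]
valid_hair_col = valid_labels[:, 3]
valid_hair_top = valid_labels[:, 4]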

In addition, each of these label sets needs to be one-hot encoded before training. This is how I encoded the labels:

from tensorflow.keras.utils import to_categorical
wrinkles = to_categorical(wrinkles, num_classes=2)

Note that you need to pass a different value for the num_classes argument depending on the number of classes for each output.
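
For completeness, here is the same encoding sketched for all five label sets, with num_classes matching the size of each output head (2, 2, 3, 9 and 4 in this model); the variable names follow on from the split shown earlier and are illustrative:

from tensorflow.keras.utils import to_categorical

# num_classes must match the number of units in the corresponding output layer
train_wrinkles = to_categorical(train_wrinkles, num_classes=2)
train_freckles = to_categorical(train_freckles, num_classes=2)
train_glasses  = to_categorical(train_glasses,  num_classes=3)
train_hair_col = to_categorical(train_hair_col, num_classes=9)
train_hair_top = to_categorical(train_hair_top, num_classes=4)

# The validation labels need the same treatment before being passed to fit()
valid_wrinkles = to_categorical(valid_wrinkles, num_classes=2)
valid_freckles = to_categorical(valid_freckles, num_classes=2)
valid_glasses  = to_categorical(valid_glasses,  num_classes=3)
valid_hair_col = to_categorical(valid_hair_col, num_classes=9)
valid_hair_top = to_categorical(valid_hair_top, num_classes=4)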