Keras VGG16: Matrix size-incompatible: In[0]: [16,18432], In[1]: [25088,4096]

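I am running a classification model based on the VGG16 network, using Keras with the TensorFlow backend. As soon as training starts, it fails with an error. This is the part of the trace that I think matters: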

Matrix size-incompatible: In[0]: [16,18432], In[1]: [25088,4096]

I believe it happens between these two layers:

flatten_1 (Flatten)              (None, 25088)         0           maxpooling2d_5[0][0]
____________________________________________________________________________________________________
dense_1 (Dense)                  (None, 4096)          102764544   flatten_1[0][0]
__________________________________________________________________________________________________

Here is the full error trace:

Traceback (most recent call last):
  File "L1.py", line 56, in <module>
    vgg.fit(batches, valid_batches, nb_epoch=1)
  File "/vgg16.py", line 220, in fit
    validation_data=val_batches, nb_val_samples=val_batches.nb_sample)
  File "..\keras\models.py", line 935, in fit_generator
    initial_epoch=initial_epoch)
  File "..\keras\engine\training.py", line 1557, in fit_generator
    class_weight=class_weight)
  File "..\keras\engine\training.py", line 1320, in train_on_batch
    outputs = self.train_function(ins)
  File "..\keras\backend\tensorflow_backend.py", line 1943, in __ca
    feed_dict=feed_dict)
  File "..\tensorflow\python\client\session.py", line 767, in run
    run_metadata_ptr)
  File "..\tensorflow\python\client\session.py", line 965, in _run
    feed_dict_string, options, run_metadata)
  File "..\tensorflow\python\client\session.py", line 1015, in _do_
    target_list, options, run_metadata)
  File "..\tensorflow\python\client\session.py", line 1035, in _do_
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Matrix size-incompatible: In[0]: [16,18432], In[1]: [25088,4096]
         [[Node: MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](Re]
         [[Node: mul_2/_231 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:lopu:0", send_device_incarnation=1, tensor_name="edge_667_mul_2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]

Caused by op 'MatMul', defined at:
  File "L1.py", line 51, in <module>
    vgg = Vgg16()
  File "/vgg16.py", line 47, in __init__
    self.create()
  File "/vgg16.py", line 139, in create
    self.FCBlock()
  File "/vgg16.py", line 113, in FCBlock
    model.add(Dense(4096, activation='relu'))
  File "..\keras\models.py", line 332, in add
    output_tensor = layer(self.outputs[0])
  File "..\keras\engine\topology.py", line 572, in __call__
    self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
  File "..\keras\engine\topology.py", line 635, in add_inbound_node
    Node.create_node(self, inbound_layers, node_indices, tensor_indices)
  File "..\keras\engine\topology.py", line 166, in create_node
    output_tensors = to_list(outbound_layer.call(input_tensors[0], mask=input_masks[0]))
  File "..\keras\layers\core.py", line 814, in call
    output = K.dot(x, self.W)
  File "..\keras\backend\tensorflow_backend.py", line 827, in dot
    out = tf.matmul(x, y)
  File "..\tensorflow\python\ops\math_ops.py", line 1765, in matmul
    a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
  File "..\tensorflow\python\ops\gen_math_ops.py", line 1454, in _m
    transpose_b=transpose_b, name=name)
  File "..\tensorflow\python\framework\op_def_library.py", line 763
    op_def=op_def)
  File "..\tensorflow\python\framework\ops.py", line 2327, in creat
    original_op=self._default_original_op, op_def=op_def)
  File "..\tensorflow\python\framework\ops.py", line 1226, in __ini
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Matrix size-incompatible: In[0]: [16,18432], In[1]: [25088,4096]
         [[Node: MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](Re]
         [[Node: mul_2/_231 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:lopu:0", send_device_incarnation=1, tensor_name="edge_667_mul_2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]

Here is the full model summary:

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
lambda_1 (Lambda)                (None, 3, 226, 226)   0           lambda_input_1[0][0]
____________________________________________________________________________________________________
convolution2d_1 (Convolution2D)  (None, 64, 224, 224)  1792        lambda_1[0][0]
____________________________________________________________________________________________________
zeropadding2d_1 (ZeroPadding2D)  (None, 64, 226, 226)  0           convolution2d_1[0][0]
____________________________________________________________________________________________________
convolution2d_2 (Convolution2D)  (None, 64, 224, 224)  36928       zeropadding2d_1[0][0]
____________________________________________________________________________________________________
maxpooling2d_1 (MaxPooling2D)    (None, 64, 112, 112)  0           convolution2d_2[0][0]
____________________________________________________________________________________________________
zeropadding2d_2 (ZeroPadding2D)  (None, 64, 114, 114)  0           maxpooling2d_1[0][0]
____________________________________________________________________________________________________
convolution2d_3 (Convolution2D)  (None, 128, 112, 112) 73856       zeropadding2d_2[0][0]
____________________________________________________________________________________________________
zeropadding2d_3 (ZeroPadding2D)  (None, 128, 114, 114) 0           convolution2d_3[0][0]
____________________________________________________________________________________________________
convolution2d_4 (Convolution2D)  (None, 128, 112, 112) 147584      zeropadding2d_3[0][0]
____________________________________________________________________________________________________
maxpooling2d_2 (MaxPooling2D)    (None, 128, 56, 56)   0           convolution2d_4[0][0]
____________________________________________________________________________________________________
zeropadding2d_4 (ZeroPadding2D)  (None, 128, 58, 58)   0           maxpooling2d_2[0][0]
____________________________________________________________________________________________________
convolution2d_5 (Convolution2D)  (None, 256, 56, 56)   295168      zeropadding2d_4[0][0]
____________________________________________________________________________________________________
zeropadding2d_5 (ZeroPadding2D)  (None, 256, 58, 58)   0           convolution2d_5[0][0]
____________________________________________________________________________________________________
convolution2d_6 (Convolution2D)  (None, 256, 56, 56)   590080      zeropadding2d_5[0][0]
____________________________________________________________________________________________________
zeropadding2d_6 (ZeroPadding2D)  (None, 256, 58, 58)   0           convolution2d_6[0][0]
____________________________________________________________________________________________________
convolution2d_7 (Convolution2D)  (None, 256, 56, 56)   590080      zeropadding2d_6[0][0]
____________________________________________________________________________________________________
maxpooling2d_3 (MaxPooling2D)    (None, 256, 28, 28)   0           convolution2d_7[0][0]
____________________________________________________________________________________________________
zeropadding2d_7 (ZeroPadding2D)  (None, 256, 30, 30)   0           maxpooling2d_3[0][0]
____________________________________________________________________________________________________
convolution2d_8 (Convolution2D)  (None, 512, 28, 28)   1180160     zeropadding2d_7[0][0]
____________________________________________________________________________________________________
zeropadding2d_8 (ZeroPadding2D)  (None, 512, 30, 30)   0           convolution2d_8[0][0]
____________________________________________________________________________________________________
convolution2d_9 (Convolution2D)  (None, 512, 28, 28)   2359808     zeropadding2d_8[0][0]
____________________________________________________________________________________________________
zeropadding2d_9 (ZeroPadding2D)  (None, 512, 30, 30)   0           convolution2d_9[0][0]
____________________________________________________________________________________________________
convolution2d_10 (Convolution2D) (None, 512, 28, 28)   2359808     zeropadding2d_9[0][0]
____________________________________________________________________________________________________
maxpooling2d_4 (MaxPooling2D)    (None, 512, 14, 14)   0           convolution2d_10[0][0]
____________________________________________________________________________________________________
zeropadding2d_10 (ZeroPadding2D) (None, 512, 16, 16)   0           maxpooling2d_4[0][0]
____________________________________________________________________________________________________
convolution2d_11 (Convolution2D) (None, 512, 14, 14)   2359808     zeropadding2d_10[0][0]
____________________________________________________________________________________________________
zeropadding2d_11 (ZeroPadding2D) (None, 512, 16, 16)   0           convolution2d_11[0][0]
____________________________________________________________________________________________________
convolution2d_12 (Convolution2D) (None, 512, 14, 14)   2359808     zeropadding2d_11[0][0]
____________________________________________________________________________________________________
zeropadding2d_12 (ZeroPadding2D) (None, 512, 16, 16)   0           convolution2d_12[0][0]
____________________________________________________________________________________________________
convolution2d_13 (Convolution2D) (None, 512, 14, 14)   2359808     zeropadding2d_12[0][0]
____________________________________________________________________________________________________
maxpooling2d_5 (MaxPooling2D)    (None, 512, 7, 7)     0           convolution2d_13[0][0]
____________________________________________________________________________________________________
flatten_1 (Flatten)              (None, 25088)         0           maxpooling2d_5[0][0]
____________________________________________________________________________________________________
dense_1 (Dense)                  (None, 4096)          102764544   flatten_1[0][0]
____________________________________________________________________________________________________
dropout_1 (Dropout)              (None, 4096)          0           dense_1[0][0]
____________________________________________________________________________________________________
dense_2 (Dense)                  (None, 4096)          16781312    dropout_1[0][0]
____________________________________________________________________________________________________
dropout_2 (Dropout)              (None, 4096)          0           dense_2[0][0]
____________________________________________________________________________________________________
dense_3 (Dense)                  (None, 1000)          4097000     dropout_2[0][0]
====================================================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
_________________________________________________

I understand that the two matrices have incompatible sizes for the matrix product, but how do I fix this?

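The image channel ordering appears to be set for the Theano backend rather than TensorFlow. With the correct ordering, the output of model.summary() should look like this: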

input_1 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928    

…

block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0         
_________________________________________________________________
flatten (Flatten)            (None, 25088)             0         
_________________________________________________________________
fc1 (Dense)                  (None, 4096)              102764544  

In your last MaxPool layer you have 512, 7, 7, but it should be 7, 7, 512.
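To make the numbers in the error message concrete, here is a minimal sketch of the shape arithmetic. It assumes the zero-padded 3x3 convolutions keep the spatial size (as your summary shows) and that each of the five max-pooling layers halves it. The helper names are hypothetical; only the numbers come from your summary and traceback.

```python
# Hypothetical helpers for illustration; they do not use Keras at all.

def pooled_size(size, num_pools=5):
    """Spatial size after `num_pools` 2x2 max-pooling layers with stride 2."""
    for _ in range(num_pools):
        size //= 2
    return size

def flattened_features(sample_shape):
    """Number of features Flatten produces for one sample of this shape."""
    total = 1
    for dim in sample_shape:
        total *= dim
    return total

def matmul_compatible(a_shape, b_shape):
    """A [m, k] x [k2, n] product is only defined when k == k2."""
    return a_shape[1] == b_shape[0]

side = pooled_size(224)               # 224 -> 112 -> 56 -> 28 -> 14 -> 7
channels_first = (512, side, side)    # Theano-style ordering
channels_last = (side, side, 512)     # TensorFlow-style ordering

print(flattened_features(channels_first))              # 25088
print(flattened_features(channels_last))               # 25088
print(matmul_compatible((16, 18432), (25088, 4096)))   # False
print(18432 // 512)                                    # 36, not 7 * 7 = 49
```

Both orderings flatten to 25088 features, which is why dense_1 has a [25088, 4096] weight matrix. The batch that actually reaches it at run time has 18432 features per sample, so the tensors flowing through the graph do not have the layout the summary was built with.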

Make sure you are using the TensorFlow ordering rather than Theano's by adding the following lines before compiling the model:

from keras import backend as K
# 'tf' = (rows, cols, channels); 'th' = (channels, rows, cols)
K.set_image_dim_ordering('tf')

You can also double-check the Keras configuration file here: ~/.keras/keras.json

It should contain this:

{
"epsilon": 1e-07,
"floatx": "float32",
"backend": "tensorflow",
"image_dim_ordering": "tf"
}
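If you would rather check the file from a script than open it by hand, something like the following should work. It is only a sketch: the function name read_image_ordering is made up for illustration, and the fallback to the "image_data_format" key assumes a newer Keras 2.x config, where the values are "channels_last" / "channels_first" instead of "tf" / "th".

```python
import json
import os

def read_image_ordering(config_path):
    """Return the image ordering stored in a Keras config file, or None."""
    with open(config_path) as f:
        config = json.load(f)
    # Keras 1.x stores "image_dim_ordering"; Keras 2.x stores "image_data_format".
    return config.get("image_dim_ordering", config.get("image_data_format"))

if __name__ == "__main__":
    path = os.path.expanduser("~/.keras/keras.json")
    if os.path.exists(path):
        print("image ordering:", read_image_ordering(path))
    else:
        print("no config file found at", path)
```

For the TensorFlow backend you want this to report "tf" (or "channels_last" on Keras 2.x).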