卷积神经网络架构——正确吗？

Question

我正在尝试训练卷积神经网络。因此，我使用了 646 images/license 个板的数据集，其中包含 8 个字符（0-9，A-Z；没有字母 'O' 和空格，总共 36 个可能的字符）。这些是我的训练数据 X_train。它们的形状是 (646, 40, 200, 3)，颜色代码为 3。我将它们调整为相同的形状。

我还有一个数据集，其中包含此图像的标签，我将其单热编码为形状 (646, 8, 36) 的 numpy 数组。这个数据是我的y_train数据。

现在，我正在尝试应用如下所示的神经网络：架构取自这篇论文：https://ieeexplore.ieee.org/abstract/document/8078501

我排除了批量归一化部分，因为这部分对我来说不是最有趣的。但是我对层的顶部非常不确定。这意味着最后一个池化层之后以 model.add(Flatten())...

开头的部分

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), input_shape = (40, 200, 3), activation = "relu"))
model.add(Conv2D(32, kernel_size=(3, 3), activation = "relu"))
model.add(Conv2D(32, kernel_size=(3, 3), activation = "relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, kernel_size=(3, 3), activation = "relu"))
model.add(Conv2D(64, kernel_size=(3, 3), activation = "relu"))
model.add(Conv2D(64, kernel_size=(3, 3), activation = "relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(128, kernel_size=(3, 3), activation = "relu"))
model.add(Conv2D(128, kernel_size=(3, 3), activation = "relu"))
model.add(Conv2D(128, kernel_size=(3, 3), activation = "relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(16000, activation = "relu"))
model.add(Dense(128, activation = "relu"))
model.add(Dense(36, activation = "relu"))
model.add(Dense(8*36, activation="Softmax"))
model.add(keras.layers.Reshape((8, 36)))

非常感谢您！

Answer 1

假设下图与您的模型架构相匹配，代码可用于创建模型。确保输入图像有一些填充。

import tensorflow as tf
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Conv2D, Flatten, MaxPooling2D, Dense, Input, Reshape, Concatenate

def create_model(input_shape = (40, 200, 3)):
    input_img = Input(shape=input_shape)
    model = Conv2D(32, kernel_size=(3, 3), input_shape = (40, 200, 3), activation = "relu")(input_img)
    model = Conv2D(32, kernel_size=(3, 3), padding="same", activation = "relu")(model)
    model = Conv2D(32, kernel_size=(3, 3), padding="same", activation = "relu")(model)
    model = MaxPooling2D(pool_size=(2, 2))(model)
    model = Conv2D(64, kernel_size=(3, 3), padding="same", activation = "relu")(model)
    model = Conv2D(64, kernel_size=(3, 3), padding="same", activation = "relu")(model)
    model = Conv2D(64, kernel_size=(3, 3), padding="same", activation = "relu")(model)
    model = MaxPooling2D(pool_size=(2, 2))(model)
    model = Conv2D(128, kernel_size=(3, 3), padding="same", activation = "relu")(model)
    model = Conv2D(128, kernel_size=(3, 3), padding="same", activation = "relu")(model)
    model = Conv2D(128, kernel_size=(3, 3), padding="same", activation = "relu")(model)
    model = MaxPooling2D(pool_size=(2, 2))(model)
    backbone = Flatten()(model)

    branches = []
    for i in range(8):
        branches.append(backbone)
        branches[i] = Dense(16000, activation = "relu", name="branch_"+str(i)+"_Dense_16000")(branches[i])
        branches[i] = Dense(128, activation = "relu", name="branch_"+str(i)+"_Dense_128")(branches[i])
        branches[i] = Dense(36, activation = "softmax", name="branch_"+str(i)+"_output")(branches[i])
    
    output = Concatenate(axis=1)(branches)
    output = Reshape((8, 36))(output)
    model = Model(input_img, output)

    return model

卷积神经网络架构——正确吗？

Convolutional Neural Net Architecture - correct?

python

architecture

conv-neural-network

valueerror