Using VGG16 for bounding box prediction on my own dataset
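After building a classifier based on VGG16, I want to predict a bounding box around the detected object.
I found online that I can do this by removing the layers after the last MaxPool layer and adding some fully connected layers: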
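from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model

# vgg16 is the base model built earlier; its output here is block5_pool
flatten = vgg16.output
flatten = Flatten()(flatten)
# Fully connected head ending in 4 outputs, one per bounding box value
bboxhead = Dense(128, activation="relu")(flatten)
bboxhead = Dense(64, activation="relu")(bboxhead)
bboxhead = Dense(32, activation="relu")(bboxhead)
bboxhead = Dense(4, activation="relu")(bboxhead)
box_model = Model(inputs=vgg16.input, outputs=bboxhead)
box_model.summary()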
The model looks like this, which matches what I found in my search:
Model: "box_model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 224, 224, 3)] 0
block1_conv1 (Conv2D) (None, 224, 224, 64) 1792
block1_conv2 (Conv2D) (None, 224, 224, 64) 36928
block1_pool (MaxPooling2D) (None, 112, 112, 64) 0
block2_conv1 (Conv2D) (None, 112, 112, 128) 73856
block2_conv2 (Conv2D) (None, 112, 112, 128) 147584
block2_pool (MaxPooling2D) (None, 56, 56, 128) 0
block3_conv1 (Conv2D) (None, 56, 56, 256) 295168
block3_conv2 (Conv2D) (None, 56, 56, 256) 590080
block3_conv3 (Conv2D) (None, 56, 56, 256) 590080
block3_pool (MaxPooling2D) (None, 28, 28, 256) 0
block4_conv1 (Conv2D) (None, 28, 28, 512) 1180160
block4_conv2 (Conv2D) (None, 28, 28, 512) 2359808
block4_conv3 (Conv2D) (None, 28, 28, 512) 2359808
block4_pool (MaxPooling2D) (None, 14, 14, 512) 0
block5_conv1 (Conv2D) (None, 14, 14, 512) 2359808
block5_conv2 (Conv2D) (None, 14, 14, 512) 2359808
block5_conv3 (Conv2D) (None, 14, 14, 512) 2359808
block5_pool (MaxPooling2D) (None, 7, 7, 512) 0
flatten (Flatten) (None, 25088) 0
dense (Dense) (None, 128) 3211392
dense_1 (Dense) (None, 64) 8256
dense_2 (Dense) (None, 32) 2080
dense_3 (Dense) (None, 4) 132
=================================================================
Total params: 17,936,548
Trainable params: 3,221,860
Non-trainable params: 14,714,688
_________________________________________________________________
Then I train the model:
from tensorflow.keras.optimizers import Adam

opt = Adam(1e-4)
box_model.compile(loss='mse', optimizer=opt)
# Batches per epoch (computed here but not passed to fit() below)
steps, val_steps = train_gen.n / batch_size, val_gen.n / batch_size
num_epochs = 30
history = box_model.fit(train_gen, validation_data=val_gen, batch_size=32, epochs=30, verbose=1)
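But I noticed that the last Dense layer has 4 units, which does not match my number of classes (5). After I changed it to 5 units, the code runs, but the model does not learn anything: the 5-value output array makes no sense (all zeros).
Or is my implementation incorrect?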
In short: your implementation is fine, but your data is wrong.
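To train a new output, you need new labels. The input does not need to change, but you need to somehow obtain the x, y, height, and width of the bounding boxes you are trying to detect. If the dataset does not provide them, you will have to annotate them yourself.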
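As a rough illustration of what such a label could look like, here is a minimal sketch that turns one annotated box into a 4-value target. The function name `to_box_label`, the corner-style input (`x_min`, `y_min`, `x_max`, `y_max`), and the scaling to the 0..1 range are all assumptions made for this example; your annotation tool may store boxes differently.

```python
def to_box_label(x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space corner box into a [x, y, height, width] label.

    Hypothetical helper: every value is divided by the image size so that
    the regression targets fall between 0 and 1.
    """
    if not (0 <= x_min < x_max <= img_w and 0 <= y_min < y_max <= img_h):
        raise ValueError("box must lie inside the image and have positive size")
    return [
        x_min / img_w,            # x of the top-left corner
        y_min / img_h,            # y of the top-left corner
        (y_max - y_min) / img_h,  # box height
        (x_max - x_min) / img_w,  # box width
    ]


# One box annotated on a 224x224 image
label = to_box_label(56, 28, 168, 196, img_w=224, img_h=224)
print(label)  # [0.25, 0.125, 0.75, 0.5]
```

Each training image then needs one such 4-value list in place of its class label.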
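If you want to train on bounding box coordinates, your labels need to be bounding box coordinates. You cannot keep training with the dataset's class labels. Whatever your model is supposed to learn in supervised learning is exactly what you need to provide as labels.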
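To make the mismatch concrete, here is a small pure-Python sketch of the mean squared error that `loss='mse'` computes for a single sample. The helper `box_mse` and the example values are made up for illustration; this is not how Keras reports a shape mismatch, it only shows that a 4-value prediction can be compared with a 4-value box label but not with a 5-class one-hot label.

```python
def box_mse(pred, target):
    """Mean squared error between one predicted box and one target label."""
    if len(pred) != len(target):
        raise ValueError(
            f"prediction has {len(pred)} values but label has {len(target)}"
        )
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


pred = [0.25, 0.25, 0.5, 0.5]          # hypothetical output of the 4-unit head
box_label = [0.25, 0.125, 0.75, 0.5]   # x, y, height, width
class_label = [0, 0, 1, 0, 0]          # one-hot label for 5 classes

print(box_mse(pred, box_label))  # 0.01953125

try:
    box_mse(pred, class_label)
except ValueError as err:
    print("cannot compare:", err)
```

Changing the last layer to 5 units makes the shapes agree, but the model is then regressing onto class indicators and never sees a box coordinate.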