Is there a way to improve val_acc?
Context:
I'm trying to train an image classifier on the kaggle cell dataset, hoping to reach 0.95 val_acc. I've tried many model architectures and epoch counts, as well as several other hyperparameters, and arrived at a promising set that yields 0.9 val_acc.
What I've tried:
- Shuffled the image-label pairs together, so each image keeps its correct label (see the sketch after this list)
- Normalized the images so every pixel value lies between 0 and 1
- Added BatchNormalization() and Dropout() to reduce overfitting (the model now underfits)
- Tried many permutations of hyperparameters
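For concreteness, a minimal sketch of the first two bullets; the array names x and y are placeholders, not the actual variables from my notebook:

```python
import numpy as np

rng = np.random.default_rng(42)
idx = rng.permutation(len(x))    # one permutation applied to both arrays,
x, y = x[idx], y[idx]            # so each image keeps its correct label
x = x.astype('float32') / 255.0  # scale 8-bit pixels into [0, 1]
```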
Problem:
The hyperparameter set that gives the best val_acc still plateaus at 0.9. I've tried a lot of permutations; is there something I'm missing or doing wrong?
Model:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 120, 160, 8) 224
_________________________________________________________________
batch_normalization (BatchNo (None, 120, 160, 8) 32
_________________________________________________________________
activation (Activation) (None, 120, 160, 8) 0
_________________________________________________________________
dropout (Dropout) (None, 120, 160, 8) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 60, 80, 8) 584
_________________________________________________________________
batch_normalization_1 (Batch (None, 60, 80, 8) 32
_________________________________________________________________
activation_1 (Activation) (None, 60, 80, 8) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 60, 80, 8) 584
_________________________________________________________________
batch_normalization_2 (Batch (None, 60, 80, 8) 32
_________________________________________________________________
activation_2 (Activation) (None, 60, 80, 8) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 60, 80, 8) 0
_________________________________________________________________
conv2d_3 (Conv2D) (None, 30, 40, 8) 584
_________________________________________________________________
batch_normalization_3 (Batch (None, 30, 40, 8) 32
_________________________________________________________________
activation_3 (Activation) (None, 30, 40, 8) 0
_________________________________________________________________
conv2d_4 (Conv2D) (None, 30, 40, 8) 584
_________________________________________________________________
batch_normalization_4 (Batch (None, 30, 40, 8) 32
_________________________________________________________________
activation_4 (Activation) (None, 30, 40, 8) 0
_________________________________________________________________
dropout_2 (Dropout) (None, 30, 40, 8) 0
_________________________________________________________________
conv2d_5 (Conv2D) (None, 15, 20, 8) 584
_________________________________________________________________
batch_normalization_5 (Batch (None, 15, 20, 8) 32
_________________________________________________________________
activation_5 (Activation) (None, 15, 20, 8) 0
_________________________________________________________________
conv2d_6 (Conv2D) (None, 15, 20, 16) 3216
_________________________________________________________________
batch_normalization_6 (Batch (None, 15, 20, 16) 64
_________________________________________________________________
activation_6 (Activation) (None, 15, 20, 16) 0
_________________________________________________________________
dropout_3 (Dropout) (None, 15, 20, 16) 0
_________________________________________________________________
conv2d_7 (Conv2D) (None, 8, 10, 16) 6416
_________________________________________________________________
batch_normalization_7 (Batch (None, 8, 10, 16) 64
_________________________________________________________________
activation_7 (Activation) (None, 8, 10, 16) 0
_________________________________________________________________
conv2d_8 (Conv2D) (None, 8, 10, 16) 6416
_________________________________________________________________
batch_normalization_8 (Batch (None, 8, 10, 16) 64
_________________________________________________________________
activation_8 (Activation) (None, 8, 10, 16) 0
_________________________________________________________________
dropout_4 (Dropout) (None, 8, 10, 16) 0
_________________________________________________________________
conv2d_9 (Conv2D) (None, 4, 5, 16) 6416
_________________________________________________________________
batch_normalization_9 (Batch (None, 4, 5, 16) 64
_________________________________________________________________
activation_9 (Activation) (None, 4, 5, 16) 0
_________________________________________________________________
flatten (Flatten) (None, 320) 0
_________________________________________________________________
dense (Dense) (None, 240) 77040
_________________________________________________________________
batch_normalization_10 (Batc (None, 240) 960
_________________________________________________________________
dropout_5 (Dropout) (None, 240) 0
_________________________________________________________________
dense_1 (Dense) (None, 162) 39042
_________________________________________________________________
batch_normalization_11 (Batc (None, 162) 648
_________________________________________________________________
dropout_6 (Dropout) (None, 162) 0
_________________________________________________________________
dense_2 (Dense) (None, 84) 13692
_________________________________________________________________
batch_normalization_12 (Batc (None, 84) 336
_________________________________________________________________
dropout_7 (Dropout) (None, 84) 0
_________________________________________________________________
dense_3 (Dense) (None, 4) 340
=================================================================
Total params: 158,114
Trainable params: 156,918
Non-trainable params: 1,196
visualization of activations and val_acc, val_loss
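For reference, here is a sketch that reconstructs the architecture the summary implies. The kernel sizes are inferred from the parameter counts (3x3 for the 8-filter convs, e.g. 3*3*3*8+8 = 224; 5x5 once 16 filters appear, e.g. 5*5*8*16+16 = 3216), downsampling is assumed to come from stride-2 convolutions with 'same' padding since no pooling layers show up, and the ReLU activations, dropout rate of 0.25, and input shape (120, 160, 3) are assumptions:

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import (Activation, BatchNormalization, Conv2D,
                                     Dense, Dropout, Flatten)

def conv_block(model, filters, kernel, strides=1, dropout=None, **kwargs):
    # Conv -> BN -> Activation (-> Dropout), the ordering the summary shows
    model.add(Conv2D(filters, kernel, strides=strides, padding='same', **kwargs))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    if dropout:
        model.add(Dropout(dropout))

model = Sequential()
conv_block(model, 8, 3, dropout=0.25, input_shape=(120, 160, 3))  # 120x160
conv_block(model, 8, 3, strides=2)                                # -> 60x80
conv_block(model, 8, 3, dropout=0.25)
conv_block(model, 8, 3, strides=2)                                # -> 30x40
conv_block(model, 8, 3, dropout=0.25)
conv_block(model, 8, 3, strides=2)                                # -> 15x20
conv_block(model, 16, 5, dropout=0.25)      # 5x5 kernels from here on
conv_block(model, 16, 5, strides=2)                               # -> 8x10
conv_block(model, 16, 5, dropout=0.25)
conv_block(model, 16, 5, strides=2)                               # -> 4x5
model.add(Flatten())                        # 4 * 5 * 16 = 320
for units in (240, 162, 84):
    model.add(Dense(units, activation='relu'))  # activation is an assumption
    model.add(BatchNormalization())
    model.add(Dropout(0.25))
model.add(Dense(4, activation='softmax'))   # 4 classes
```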
Note:
The optimization was done using talos and can be found here. I edited and added some modules here.
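For anyone unfamiliar with talos, this is roughly how a scan is wired up; the parameter grid values and the build_model helper below are hypothetical placeholders, not my actual search space:

```python
import talos
from tensorflow.keras.optimizers import Nadam

# Hypothetical search space; the real grid covered ~200 combinations.
p = {'lr': [1e-4, 2e-4, 1e-3],
     'dropout': [0.1, 0.25, 0.5],
     'batch_size': [32, 64]}

def cell_model(x_train, y_train, x_val, y_val, params):
    # build_model is a hypothetical helper that assembles the CNN above
    model = build_model(dropout=params['dropout'])
    model.compile(optimizer=Nadam(learning_rate=params['lr']),
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    out = model.fit(x_train, y_train,
                    validation_data=(x_val, y_val),
                    batch_size=params['batch_size'],
                    epochs=30, verbose=0)
    return out, model  # talos expects (history, model) back

scan = talos.Scan(x=x_train, y=y_train, x_val=x_val, y_val=y_val,
                  params=p, model=cell_model, experiment_name='cell_scan')
```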
Edit 1:
The optimizer I used is Nadam with a learning rate of 0.0002. Full notebook.
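In compile terms that is simply the following (the loss and metric names are assumptions; only the optimizer and learning rate are from my setup):

```python
from tensorflow.keras.optimizers import Nadam

model.compile(optimizer=Nadam(learning_rate=0.0002),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```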
TLDR:
Trained on the kaggle cell dataset using the best hyperparameters from a test run that tried about 200 different hyperparameter combinations. It plateaus at 0.9. Why not higher?

As far as I can tell, the learning rate I was using was too low. Increasing it seems to help.
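One way to test that, as a sketch: start from a higher learning rate and let a scheduler back it off when val_loss stalls. The 0.001 starting rate and the callback settings here are guesses, not tuned values:

```python
from tensorflow.keras.optimizers import Nadam
from tensorflow.keras.callbacks import ReduceLROnPlateau

model.compile(optimizer=Nadam(learning_rate=0.001),  # up from 0.0002
              loss='categorical_crossentropy',
              metrics=['accuracy'])

history = model.fit(x_train, y_train,
                    validation_data=(x_val, y_val),
                    epochs=50,
                    callbacks=[ReduceLROnPlateau(monitor='val_loss',
                                                 factor=0.5, patience=3)])
```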