The BatchNorm Layer in Stage-1 of my ResNet is connected to all the other BatchNorm layers. Why?
Here are some snapshots of the ResNet model I implemented. The graph was generated with TensorBoard.
Is TensorFlow doing some optimization behind the scenes?
I implemented the code using Keras.
There are two blocks in the model: IdentityBlock and ConvolutionalBlock. Adding the code for these blocks triggered Stack Overflow's "your post is mostly code" restriction, so they are omitted here.
In the ResNet function (def ResNet), I used BatchNormalization, named it 'bnl_stg-1', and passed only a single input (X) to it. Yet for some reason it is connected to all the BatchNorm layers in the identity and convolutional blocks, as shown in the figure.
Here is the code:
from keras.layers import (Input, ZeroPadding2D, Conv2D, BatchNormalization,
                          Activation, MaxPooling2D, AveragePooling2D,
                          Flatten, Dense)
from keras.models import Model

# convolutional_block and identity_block are defined elsewhere
# (omitted because of the "your post is mostly code" limit).

def ResNet(input_shape, features):
    '''
    Implements the ResNet50 model:
    [Conv2D -> BatchNorm -> ReLU -> MaxPool2D]
    --> [ConvBlock -> IdentityBlock * 2]
    --> [ConvBlock -> IdentityBlock * 3]
    --> [AveragePool2D -> Flatten -> Dense -> Sigmoid]
    '''
    X_input = Input(input_shape)
    X = ZeroPadding2D((3, 3))(X_input)

    # Stage 1
    X = Conv2D(filters=64,
               kernel_size=(7, 7),
               strides=(2, 2),
               name='cnl_stg-1',
               kernel_initializer='glorot_uniform')(X)
    X = BatchNormalization(axis=3,
                           name='bnl_stg-1')(X)
    X = Activation('relu')(X)
    X = MaxPooling2D(pool_size=(3, 3),
                     strides=(2, 2))(X)

    # Stage 2
    X = convolutional_block(X, f=3, filters=[64, 64, 256], stage=2, s=1)
    X = identity_block(X, 3, [64, 64, 256], stage=2, block=1)
    X = identity_block(X, 3, [64, 64, 256], stage=2, block=2)

    # Stage 3
    X = convolutional_block(X, f=3, filters=[128, 128, 512], stage=3, s=2)
    X = identity_block(X, 3, [128, 128, 512], stage=3, block=1)
    X = identity_block(X, 3, [128, 128, 512], stage=3, block=2)
    X = identity_block(X, 3, [128, 128, 512], stage=3, block=3)

    # Final stage
    X = AveragePooling2D(pool_size=(2, 2),
                         strides=(2, 2))(X)
    X = Flatten()(X)
    X = Dense(features, activation='sigmoid',
              name='fc' + str(features),
              kernel_initializer='glorot_uniform')(X)

    # Create model
    model = Model(inputs=X_input, outputs=X, name='ResNet')
    return model
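For context, this is roughly how the model is built and its graph exported for TensorBoard (a minimal sketch, assuming a TensorFlow 1.x backend; the input shape, feature count, and log directory below are placeholder values, not from the original post):

import tensorflow as tf
from keras import backend as K

model = ResNet(input_shape=(224, 224, 3), features=1)  # placeholder shape/features
model.summary()

# Write the session graph so TensorBoard can render it
tf.summary.FileWriter('./logs', K.get_session().graph)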
Snapshot of the graph: bnl_stg-1 (the Stage 1 BatchNorm layer)
You don't need to worry. Batch Normalization behaves differently during training and inference, so Keras adds a boolean variable to control it (keras_learning_phase, if I remember correctly). That is why all those layers are connected.
You can expect similar behavior from the Dropout layers.
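Here is a minimal sketch of that shared learning-phase switch (assuming the standalone Keras API with a TensorFlow 1.x backend; the toy input shape is a placeholder):

import numpy as np
from keras import backend as K
from keras.layers import Input, BatchNormalization
from keras.models import Model

inp = Input((4,))
out = BatchNormalization()(inp)
model = Model(inp, out)

x = np.random.randn(8, 4).astype('float32')

# K.learning_phase() is a single boolean tensor shared by every
# BatchNorm (and Dropout) layer in the graph, which is why they all
# appear connected in TensorBoard.
f = K.function([model.input, K.learning_phase()], [model.output])
y_train = f([x, 1])[0]  # learning phase 1: normalize with batch statistics
y_infer = f([x, 0])[0]  # learning phase 0: use the moving averages

Because every such layer reads from this one tensor, TensorBoard draws an edge from it to each of them; the edges reflect the control input, not your data flow.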