Batch-size dimension not honored when composing models
I have a Keras model that I defined during training as:
img = keras.Input(shape=[65, 65, 2])
bnorm = keras.layers.BatchNormalization()(img)
...
model = keras.Model(img, outputprob)
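During serving, though, my inputs are different. So I defined an input layer (I verified that the shape of to_img is also (65, 65, 2)) and tried to compose the models like this: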
to_img = keras.layers.Lambda(...)(json_input)
model_output = model(to_img)
serving_model = keras.Model(json_input, model_output)
However, I get this error:
tensorflow.python.framework.errors_impl.InvalidArgumentError:
Shape must be rank 4 but is rank 3 for
'model/batch_normalization/cond/FusedBatchNorm' (op:
'FusedBatchNorm') with input shapes: [65,65,2],
[2], [2], [0], [0].
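This seems to indicate that the batch dimension is not making it through. Why?
EDIT:
Things I have tried:
(1) Explicitly setting trainable=False on all layers, but that did not seem to make any difference:
model_core = model
for layer in model_core.layers:
    layer.trainable = False
model_output = model_core(to_img)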
(2) Expanding the dimensions of the preprocessing result:
to_img = keras.layers.Lambda(
    lambda x : preproc(x))(json_input)
to_img = keras.layers.Lambda(
    lambda x : tf.expand_dims(x, axis=0) )(to_img)
This results in the error AttributeError: 'Model' object has no attribute '_name' on the line serving_model = keras.Model(json_input, model_output)
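(3) Changing the Lambda layer to use map_fn so that the data is processed one item at a time:
to_img = keras.layers.Lambda(
    lambda items: K.map_fn(lambda x: preproc, items))(json_input)
This resulted in a shape error indicating that the preprocessing function was getting items of shape [65,2] rather than [65,65,2]. This suggests that the Lambda layer applies the function to one example at a time.
(4) Here is the full code for the model: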
img = keras.Input(shape=[height, width, 2])
# convolutional part of model
cnn = keras.layers.BatchNormalization()(img)
for layer in range(nlayers):
    nfilters = nfil * (layer + 1)
    cnn = keras.layers.Conv2D(nfilters, (ksize, ksize), padding='same')(cnn)
    cnn = keras.layers.Activation('elu')(cnn)
    cnn = keras.layers.BatchNormalization()(cnn)
    cnn = keras.layers.MaxPooling2D(pool_size=(2, 2))(cnn)
cnn = keras.layers.Flatten()(cnn)
cnn = keras.layers.Dropout(dprob)(cnn)
cnn = keras.layers.Dense(10, activation='relu')(cnn)
# feature engineering part of model
engfeat = keras.layers.Lambda(
    lambda x: engineered_features(x, height//2))(img)
# concatenate the two parts
both = keras.layers.concatenate([cnn, engfeat])
ltgprob = keras.layers.Dense(1, activation='sigmoid')(both)
# create a model
model = keras.Model(img, ltgprob)
def rmse(y_true, y_pred):
    import tensorflow.keras.backend as K
    return K.sqrt(K.mean(K.square(y_pred - y_true), axis=-1))
optimizer = tf.keras.optimizers.Adam(lr=params['learning_rate'],
                                     clipnorm=1.)
model.compile(optimizer=optimizer,
              loss='binary_crossentropy',
              metrics=['accuracy', 'mse', rmse])
and the code for the preprocessing function:
def reshape_into_image(features, params):
    # stack the inputs to form a 2-channel input
    # features['ref'] is [-1, height*width]
    # stacked image is [-1, height*width, n_channels]
    n_channels = 2
    stacked = tf.concat([features['ref'], features['ltg']], axis=1)
    height = width = PATCH_SIZE(params)
    return tf.reshape(stacked, [height, width, n_channels])
and the serving layers:
# 1. layer that extracts multiple inputs from JSON
height = width = PATCH_SIZE(hparams)
json_input = keras.layers.concatenate([
    keras.layers.Input(name='ref', dtype=tf.float32, shape=(height * width,)),
    keras.layers.Input(name='ltg', dtype=tf.float32, shape=(height * width,)),
], axis=0)
# 2. convert json_input to image (what model wants)
to_img = keras.layers.Lambda(
    lambda x: reshape_into_image(features={
        'ref': tf.reshape(x[0], [height * width, 1]),
        'ltg': tf.reshape(x[1], [height * width, 1])
    }, params=hparams),
    name='serving_reshape')(json_input)
# 3. now, use trained model to predict
model_output = model(to_img)
# 4. create serving model
serving_model = keras.Model(json_input, model_output)
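Taking the samples axis into account, the input shape of the model is (?, 65, 65, 2), where ? can be one or more. So you need to modify the Lambda layer (actually, the function wrapped in it) so that its output is (?, 65, 65, 2) as well. One way to do this is to use K.expand_dims(out, axis=0) in the wrapped function, so that the output has a shape of (1, 65, 65, 2).
By the way, K refers to the backend: from keras import backend as K.
Further, note that the function wrapped by Lambda must be defined so that it preserves the batch axis; if it does not, it is very likely that something is wrong in the definition of that function.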
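As a rough illustration of what preserving the batch axis means, here is a minimal sketch that uses NumPy arrays in place of tensors (an assumption made so that it runs without TensorFlow; tf.stack and tf.reshape follow the same shape rules). The helper name reshape_batch_into_image and its two-argument signature are hypothetical and are not the function from the question.

```python
import numpy as np

height, width, n_channels = 65, 65, 2

def reshape_batch_into_image(ref, ltg):
    """Hypothetical helper: two [batch, height*width] arrays -> [batch, height, width, 2]."""
    # Stack on a new last axis so every pixel carries its two channels: [batch, h*w, 2]
    stacked = np.stack([ref, ltg], axis=-1)
    # The leading -1 lets the batch size be inferred instead of being dropped
    return stacked.reshape(-1, height, width, n_channels)

batch = 3
# Made-up data: ref counts upward, ltg is its negative, so channels are easy to tell apart
ref = np.arange(batch * height * width, dtype=np.float32).reshape(batch, height * width)
ltg = -ref

img = reshape_batch_into_image(ref, ltg)
print(img.shape)  # (3, 65, 65, 2)
```

Unlike expand_dims(out, axis=0), which always yields a batch of exactly one, the -1 in the target shape works for any batch size.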
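Update:
The error AttributeError: 'Model' object has no attribute '_name' is raised because you are passing json_input as the inputs of the model. However, it is not an input layer; it is the output of the concatenation layer. To resolve this, first define the input layers and then pass them both to the concatenation layer and to the Model class, like this: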
inputs = [keras.layers.Input(name='ref', dtype=tf.float32, shape=(height * width,)),
          keras.layers.Input(name='ltg', dtype=tf.float32, shape=(height * width,))]
json_input = keras.layers.concatenate(inputs, axis=0)
# ...
serving_model = keras.Model(inputs, model_output)
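Update 2:
I think you can write this much more simply, without going through a lot of unnecessary trouble. You want to go from two tensors of shape (?, h*w) to one tensor of shape (?, h, w, 2). You can use a Reshape layer, so that would be: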
from keras.layers import Reshape, concatenate
inputs = [keras.layers.Input(name='ref', dtype=tf.float32, shape=(height * width,)),
          keras.layers.Input(name='ltg', dtype=tf.float32, shape=(height * width,))]
# Reshape takes the per-example shape; the batch axis is kept automatically
reshape_layer = Reshape((height, width, 1))
r_in1 = reshape_layer(inputs[0])
r_in2 = reshape_layer(inputs[1])
# concatenate joins along the last (channel) axis by default: (?, h, w, 2)
img = concatenate([r_in1, r_in2])
output = model(img)
serving_model = keras.Model(inputs, output)
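No need for any custom function or Lambda layer.
By the way, in case you are interested, the batch axis was being removed by this line:
return tf.reshape(stacked, [height, width, n_channels])
You are not taking the batch axis into account when reshaping.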
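To see why that line produces the rank-3 tensor named in the error message, the shapes can be traced with NumPy standing in for TensorFlow (an assumption made to keep the example self-contained; the array contents are made up):

```python
import numpy as np

height = width = 65
n_channels = 2

# One example as the serving Lambda builds it: two [height*width, 1] columns
ref = np.zeros((height * width, 1), dtype=np.float32)
ltg = np.ones((height * width, 1), dtype=np.float32)
stacked = np.concatenate([ref, ltg], axis=1)  # [height*width, 2]

# The original target shape has no batch axis, so the result is rank 3
no_batch = stacked.reshape(height, width, n_channels)
# A leading -1 in the target shape yields the rank-4 input the model expects
with_batch = stacked.reshape(-1, height, width, n_channels)

print(no_batch.ndim, no_batch.shape)      # 3 (65, 65, 2)
print(with_batch.ndim, with_batch.shape)  # 4 (1, 65, 65, 2)
```

The rank-3 result matches the [65,65,2] input shape that FusedBatchNorm rejected in the question.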