Tensorflow 2: How to fit a subclassed model that returns multiple values in the call method?
I built the following model via model subclassing in TensorFlow 2:
from tensorflow.keras import Model, Input
from tensorflow.keras.applications import DenseNet201
from tensorflow.keras.applications.densenet import preprocess_input
from tensorflow.keras.layers import Flatten, Dense
class Detector(Model):
    def __init__(self, num_classes=3, name="DenseNet201"):
        super(Detector, self).__init__(name=name)
        self.feature_extractor = DenseNet201(
            include_top=False,
            weights="imagenet",
        )
        self.feature_extractor.trainable = False
        self.flatten_layer = Flatten()
        self.prediction_layer = Dense(num_classes, activation=None)

    def call(self, inputs):
        x = preprocess_input(inputs)
        extracted_feature = self.feature_extractor(x, training=False)
        x = self.flatten_layer(extracted_feature)
        y_hat = self.prediction_layer(x)
        return extracted_feature, y_hat
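The next steps are compiling and fitting the model. The model compiles fine, but when fitting on my image generator (built from ImageDataGenerator), I get this error:
InvalidArgumentError: Incompatible shapes: [64,18,18] vs. [64,1] [[node Equal (defined at :19) ]] [Op:__inference_train_function_32187] Function call stack: train_function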
history = detector.fit(
    train_generator,
    epochs=1,
    validation_data=val_generator,
    callbacks=callbacks
)
This is expected, because TensorFlow does not know whether the prediction during detector.fit() is y_hat or extracted_feature, so it throws the error. What is the correct way to call detector.fit in my case?
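A note on the error message itself: the two reported shapes cannot be broadcast against each other, which is why the Equal op fails. One plausible reading, stated here as an assumption, is that the accuracy metric compared a tensor derived from extracted_feature with the labels. The sketch below implements the usual NumPy-style broadcasting rule in plain Python; the helper name broadcastable is hypothetical and not a TensorFlow API.

```python
def broadcastable(shape_a, shape_b):
    """Return True if two shapes are compatible under NumPy-style broadcasting."""
    # Compare dimensions from the trailing axis backwards.
    for a, b in zip(reversed(shape_a), reversed(shape_b)):
        # Two dimensions are compatible when they match or one of them is 1.
        if a != b and a != 1 and b != 1:
            return False
    return True


# Shapes taken from the error message above.
print(broadcastable((64, 18, 18), (64, 1)))  # False: 18 vs. 64 on the second-to-last axis
# A per-sample prediction of shape (64,) would be compatible with the labels.
print(broadcastable((64,), (64, 1)))  # True
```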
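Following up on this: you should first train your model with (let's say) one input and one output. Later, if you want to compute grad-cam, you will pick some intermediate layer of your base model (not the final output of the base model), and in that case you need to build your feature extractor separately. For example: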
# (let's say: one input and one output)
# use for training
base_model = keras.application(...)
x = base_model(..)
dese_drop_bn_[whatever] = x
out = dese_drop_bn_[whatever]
model = Model(base_model.input, out)
# inference / we need to compute grad cam
new_model = tf.keras.models.Model(model.input,
                                  [model.layers[15].output, model.output])
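The model above is the one used for training. Later, at inference time, if you need to compute grad-cam from a particular layer, say layer 15, you need to build new_model with the appropriate outputs. Hope this makes things clear. For more on feature extraction, see the official documentation section "Extract and reuse nodes in the graph of layers". FYI, exactly the same thing happens there as I described earlier. Also check this official code example; you will see exactly the same thing there.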
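As a rough, runnable illustration of the pattern above, the sketch below swaps the pretrained base model for a tiny hypothetical network so that no weights need to be downloaded. The layer name feature_conv, the input size, and the layer sizes are assumptions made only for this example.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

# Tiny stand-in network (hypothetical; takes the place of the pretrained base model).
inputs = layers.Input(shape=(32, 32, 3))
x = layers.Conv2D(8, 3, activation="relu", name="feature_conv")(inputs)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(5, activation="softmax")(x)

# Single-output model: this is the one to pass to compile() and fit().
model = Model(inputs, outputs)

# Two-output view that shares the same layers and weights; used only at inference.
feature_model = Model(
    model.input,
    [model.get_layer("feature_conv").output, model.output],
)

batch = np.random.rand(4, 32, 32, 3).astype("float32")
features, probs = feature_model(batch, training=False)
print(features.shape, probs.shape)
```

Because feature_model reuses the layers of model, any training done on model is reflected in feature_model without copying weights.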
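However, I think there is another approach that may suit you better. When you use a custom model, you can make use of the privileged training argument in the call() method. It is normally True at training time and False at inference time, so you can return the desired outputs accordingly. Here is the full code example: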
import tensorflow as tf
# get some data
data_dir = tf.keras.utils.get_file(
    'flower_photos',
    'https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz',
    untar=True)

datagen_kwargs = dict(rescale=1./255, validation_split=.20)
dataflow_kwargs = dict(target_size=(64, 64),
                       batch_size=16,
                       interpolation="bilinear")

train_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rotation_range=40,
    horizontal_flip=True,
    width_shift_range=0.2, height_shift_range=0.2,
    shear_range=0.2, zoom_range=0.2,
    **datagen_kwargs)

train_generator = train_datagen.flow_from_directory(
    data_dir, subset="training", shuffle=True, **dataflow_kwargs)

# inspect one batch
for image, label in train_generator:
    print(image.shape, image.dtype)
    print(label.shape, label.dtype)
    print(label[:4])
    break
(16, 64, 64, 3) float32
(16, 5) float32
[[0. 0. 0. 0. 1.]
[0. 0. 0. 1. 0.]
[0. 0. 0. 1. 0.]
[0. 0. 0. 0. 1.]]
Here we do the trick based on the boolean value of training in the call method.
# same imports as in the question
from tensorflow.keras import Model
from tensorflow.keras.applications import DenseNet201
from tensorflow.keras.applications.densenet import preprocess_input
from tensorflow.keras.layers import Flatten, Dense

class Detector(Model):
    def __init__(self, num_classes=5, name="DenseNet201"):
        super(Detector, self).__init__(name=name)
        self.feature_extractor = DenseNet201(
            include_top=False,
            weights="imagenet",
        )
        self.feature_extractor.trainable = False
        self.flatten_layer = Flatten()
        self.prediction_layer = Dense(num_classes, activation='softmax')

    def call(self, inputs, training):
        x = preprocess_input(inputs)
        extracted_feature = self.feature_extractor(x, training=False)
        x = self.flatten_layer(extracted_feature)
        y_hat = self.prediction_layer(x)
        # one output while training, two outputs otherwise
        if training:
            return y_hat
        else:
            return [y_hat, extracted_feature]
Training
det = Detector()
det.compile(loss='categorical_crossentropy',
            optimizer='adam', metrics=['acc'])
train_step = train_generator.samples // train_generator.batch_size
det.fit(train_generator,
        steps_per_epoch=train_step,
        validation_data=train_generator,
        validation_steps=train_step,
        epochs=2, verbose=2)
Epoch 1/2
37s 139ms/step - loss: 1.7543 - acc: 0.2650 - val_loss: 1.5310 - val_acc: 0.3764
Epoch 2/2
21s 115ms/step - loss: 1.4913 - acc: 0.3915 - val_loss: 1.3066 - val_acc: 0.4667
<tensorflow.python.keras.callbacks.History at 0x7fa2890b1790>
Evaluation
det.evaluate(train_generator,
             steps=train_step)
4s 76ms/step - loss: 1.3066 - acc: 0.4667
[1.3065541982650757, 0.46666666865348816]
Inference
Here we get two outputs from the model (unlike the single output we get during training).
y_hat, base_feature = det.predict(train_generator,
                                  steps=train_step)
y_hat.shape, base_feature.shape
((720, 5), (720, 2, 2, 1920))
Now you can do grad-cam or anything else that needs such a feature map.
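For reference, here is a minimal Grad-CAM-style sketch built on a model whose call returns [y_hat, extracted_feature] when training is false. It is an illustration under stated assumptions, not the exact procedure from the answer: TinyDetector is a hypothetical small stand-in for Detector (so no pretrained weights are downloaded), and grad_cam is a helper name introduced only for this example.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model


class TinyDetector(Model):
    """Hypothetical stand-in for Detector with a small conv feature extractor."""

    def __init__(self, num_classes=5):
        super().__init__()
        self.feature_extractor = layers.Conv2D(8, 3, activation="relu")
        self.pool = layers.GlobalAveragePooling2D()
        self.prediction_layer = layers.Dense(num_classes, activation="softmax")

    def call(self, inputs, training=None):
        extracted_feature = self.feature_extractor(inputs)
        y_hat = self.prediction_layer(self.pool(extracted_feature))
        if training:
            return y_hat
        return [y_hat, extracted_feature]


def grad_cam(model, images, class_index):
    """Return one non-negative heatmap per image for the given class."""
    images = tf.convert_to_tensor(images)
    with tf.GradientTape() as tape:
        # Watch the input so every downstream op is recorded,
        # even if the feature extractor is frozen.
        tape.watch(images)
        y_hat, feature = model(images, training=False)
        score = y_hat[:, class_index]
    # Gradient of the class score with respect to the feature map.
    grads = tape.gradient(score, feature)
    # Channel weights: average the gradients over the spatial axes.
    weights = tf.reduce_mean(grads, axis=(1, 2), keepdims=True)
    # Weighted sum over channels, keeping only positive evidence.
    return tf.nn.relu(tf.reduce_sum(weights * feature, axis=-1))


det = TinyDetector()
images = np.random.rand(2, 32, 32, 3).astype("float32")
heatmaps = grad_cam(det, images, class_index=0)
print(heatmaps.shape)
```

The heatmaps have the spatial size of the feature map, so they would normally be resized to the input resolution before being overlaid on the images.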