Tensorflow model with custom loss function gets no training done
I created a custom loss function that looks like this:
import tensorflow as tf
import tensorflow.keras.backend as K

def custom_loss(y_true, y_pred):
    y_true = K.cast(y_true, tf.float32)
    y_pred = K.cast(y_pred, tf.float32)
    # +1 where the signs of y_true and y_pred agree, -1 where they disagree
    mask = K.sign(y_true) * K.sign(y_pred)
    # map +1 -> 0 and -1 -> 1, so the mask keeps only the sign-disagreeing entries
    mask = ((mask * -1) + 1) / 2
    losses = K.abs(y_true * mask)
    return K.sum(losses)
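To make explicit what this loss computes, here is the same arithmetic written out in plain NumPy (my own hand-checked sketch, not code from the post): the mask is 1 where y_true and y_pred disagree in sign and 0 where they agree, so the loss sums |y_true| over the sign-disagreeing entries.

```python
import numpy as np

def custom_loss_np(y_true, y_pred):
    # sign(y_true) * sign(y_pred) is +1 when the signs agree, -1 when they disagree
    mask = np.sign(y_true) * np.sign(y_pred)
    # map +1 -> 0 and -1 -> 1: keep only the sign-disagreeing entries
    mask = ((mask * -1) + 1) / 2
    # sum |y_true| over the entries where the prediction has the wrong sign
    return np.sum(np.abs(y_true * mask))

y_true = np.array([1.0, -1.0, 2.0])
y_pred = np.array([0.5, 0.5, -0.1])  # wrong sign on the last two entries
print(custom_loss_np(y_true, y_pred))  # 0 + 1 + 2 = 3.0
```

So a prediction with the correct sign contributes nothing, regardless of its magnitude.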
However, when I try to train a model with this loss function, no training happens at all. The model trains fine with other loss functions such as mse and mae, and I have tried a wide range of learning rates and model complexities. Here is how I know no training is taking place:
model = get_compiled_model()
print(model.predict(train_x)[:10])
model.fit(train_x, train_y, epochs=5, verbose=1)
print(model.predict(train_x)[:10])
model.fit(train_x, train_y, epochs=5, verbose=1)
print(model.predict(train_x)[:10])
[[0.19206487]
[0.19201839]
[0.19199933]
[0.19199185]
[0.19206186]
[0.19208357]
[0.1920282 ]
[0.19203594]
[0.1919941 ]
[0.19202243]]
Epoch 1/5
1/1 [==============================] - 0s 1ms/step - loss: 0.0179
Epoch 2/5
1/1 [==============================] - 0s 2ms/step - loss: 0.0179
Epoch 3/5
1/1 [==============================] - 0s 1ms/step - loss: 0.0179
Epoch 4/5
1/1 [==============================] - 0s 1ms/step - loss: 0.0179
Epoch 5/5
1/1 [==============================] - 0s 2ms/step - loss: 0.0179
[[0.19206487]
[0.19201839]
[0.19199933]
[0.19199185]
[0.19206186]
[0.19208357]
[0.1920282 ]
[0.19203594]
[0.1919941 ]
[0.19202243]]
Epoch 1/5
1/1 [==============================] - 0s 1ms/step - loss: 0.0179
Epoch 2/5
1/1 [==============================] - 0s 2ms/step - loss: 0.0179
Epoch 3/5
1/1 [==============================] - 0s 2ms/step - loss: 0.0179
Epoch 4/5
1/1 [==============================] - 0s 951us/step - loss: 0.0179
Epoch 5/5
1/1 [==============================] - 0s 1ms/step - loss: 0.0179
[[0.19206487]
[0.19201839]
[0.19199933]
[0.19199185]
[0.19206186]
[0.19208357]
[0.1920282 ]
[0.19203594]
[0.1919941 ]
[0.19202243]]
The 2D arrays in the output above are the model's first 10 predictions, and they do not change in the slightest even after five epochs of training. My intuition tells me something is wrong with the loss function, but I can't tell what. The model looks like this:
def get_compiled_model():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(10, input_dim=2*training_size+1, activation='softmax'),
        tf.keras.layers.Dense(10, activation='softmax'),
        tf.keras.layers.Dense(1, activation='tanh')
    ])
    opt = tf.keras.optimizers.Adam(learning_rate=0.0005)
    model.compile(optimizer=opt,
                  loss=custom_loss,
                  metrics=[])
    return model
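Besides eyeballing the predictions, another way to confirm whether any training happens is to snapshot the weights before and after fit and compare them. A small helper like the following (the name `weights_changed` is my own; the Keras `model.get_weights()` call it would be used with returns a list of NumPy arrays):

```python
import numpy as np

def weights_changed(before, after, tol=1e-8):
    """Return True if any weight array moved by more than tol."""
    return any(np.max(np.abs(b - a)) > tol for b, a in zip(before, after))

# Intended usage with a Keras model:
# before = [w.copy() for w in model.get_weights()]
# model.fit(train_x, train_y, epochs=5)
# print(weights_changed(before, model.get_weights()))
```

If this prints False after fit, the optimizer is receiving (near-)zero gradients rather than the model having converged.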
I ran your model and loss function on some fake data because I wanted to check the derivatives:
import numpy
import tensorflow as tf
import tensorflow.keras.backend as K

if __name__ == "__main__":
    m = get_compiled_model()
    x = numpy.random.random((1000, 21))
    x = numpy.array(x, dtype="float32")
    exp_y = numpy.random.random((1000, 1))
    exp_y = (exp_y > 0.5) * 1.0
    with tf.GradientTape() as tape:
        y = m(x)
        # note: custom_loss is defined as (y_true, y_pred); here the model output is passed first
        loss = custom_loss(y, exp_y)
        # loss = tf.keras.losses.mse(y, exp_y)
    grad = tape.gradient(loss, m.trainable_variables)
    for var, g in zip(m.trainable_variables, grad):
        # prints the sum of squared gradient entries for each variable
        print(f'{var.name}, shape: {K.sum(g*g)}')
With the mse loss function:
dense/kernel:0, shape: 2817.013671875
dense/bias:0, shape: 530.52197265625
dense_1/kernel:0, shape: 3826.3974609375
dense_1/bias:0, shape: 25160.9375
dense_2/kernel:0, shape: 125238.34375
dense_2/bias:0, shape: 1241268.5
With the custom loss function:
dense/kernel:0, shape: 34.87071228027344
dense/bias:0, shape: 6.609962463378906
dense_1/kernel:0, shape: 107.27591705322266
dense_1/bias:0, shape: 824.83740234375
dense_2/kernel:0, shape: 5944.91796875
dense_2/bias:0, shape: 59201.58203125
We can see that the sums of squared derivatives differ by several orders of magnitude. Even on this random data, the MSE loss causes the model's output to change over time, though that may just be an artifact of the fake data I generated.
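One property of the loss that is easy to verify numerically, and that helps explain the small gradients: sign() is piecewise constant, so its derivative is zero everywhere except at the origin. Any gradient that has to flow through the mask's sign() calls is therefore killed. A quick finite-difference check in plain NumPy (my own sketch, not code from the post):

```python
import numpy as np

def finite_diff(f, x, eps=1e-4):
    # central finite-difference approximation of df/dx
    return (f(x + eps) - f(x - eps)) / (2 * eps)

# away from 0, sign(x) is locally constant, so its numerical derivative is 0
for x in [-1.5, -0.3, 0.7, 2.0]:
    print(x, finite_diff(np.sign, x))  # prints 0.0 for each point
```

So the only path through which the loss can produce a useful gradient is the `y_true * mask` product, not the mask itself.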