TensorFlow Keras RMSE metric returns different results than my own RMSE loss function
This is a regression problem.
My custom RMSE loss:
def root_mean_squared_error_loss(y_true, y_pred):
    return tf.keras.backend.sqrt(tf.keras.losses.MSE(y_true, y_pred))
Example training code, where create_model returns a dense, fully connected Sequential model:
from tensorflow.keras.metrics import RootMeanSquaredError
model = create_model()
model.compile(loss=root_mean_squared_error_loss, optimizer='adam', metrics=[RootMeanSquaredError()])
model.fit(train_.values,
          targets,
          validation_split=0.1,
          verbose=1,
          batch_size=32)
Train on 3478 samples, validate on 387 samples
Epoch 1/100
3478/3478 [==============================] - 2s 544us/sample - loss: 1.1983 - root_mean_squared_error: 0.7294 - val_loss: 0.7372 - val_root_mean_squared_error: 0.1274
Epoch 2/100
3478/3478 [==============================] - 1s 199us/sample - loss: 0.8371 - root_mean_squared_error: 0.3337 - val_loss: 0.7090 - val_root_mean_squared_error: 0.1288
Epoch 3/100
3478/3478 [==============================] - 1s 187us/sample - loss: 0.7336 - root_mean_squared_error: 0.2468 - val_loss: 0.6366 - val_root_mean_squared_error: 0.1062
Epoch 4/100
3478/3478 [==============================] - 1s 187us/sample - loss: 0.6668 - root_mean_squared_error: 0.2177 - val_loss: 0.5823 - val_root_mean_squared_error: 0.0818
I expected the loss and root_mean_squared_error to have the same value. Why do they differ?
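Two main differences, from the source code:
1. RMSE is a stateful metric (it keeps memory); yours is stateless.
2. The square root is applied after taking a global mean, not before an axis=-1 mean like MSE does.
   - As a result of 1, 2 is more involved: the mean of one running quantity, total, is taken with respect to another running quantity, count; both quantities are reset via RMSE.reset_states().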
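To make the second difference concrete, here is a minimal plain-Python sketch (no TensorFlow required) with made-up numbers. The helper names per_row_rmse and global_rmse are hypothetical and only mimic the two orders of operation. It also assumes, as I understand Keras to behave, that per-sample loss values are averaged over the batch.

```python
import math

def per_row_rmse(y_true, y_pred):
    """Square root applied to each row's MSE (mean over the last axis)."""
    rows = []
    for t_row, p_row in zip(y_true, y_pred):
        mse = sum((t - p) ** 2 for t, p in zip(t_row, p_row)) / len(t_row)
        rows.append(math.sqrt(mse))
    return rows

def global_rmse(y_true, y_pred):
    """Square root applied once, after averaging over every element."""
    squared = [(t - p) ** 2
               for t_row, p_row in zip(y_true, y_pred)
               for t, p in zip(t_row, p_row)]
    return math.sqrt(sum(squared) / len(squared))

# Made-up batch of two samples with two outputs each.
y_true = [[0.0, 0.0], [0.0, 0.0]]
y_pred = [[1.0, 1.0], [3.0, 3.0]]

rows = per_row_rmse(y_true, y_pred)          # [1.0, 3.0]
loss_style = sum(rows) / len(rows)           # mean of the roots: 2.0
metric_style = global_rmse(y_true, y_pred)   # root of the mean: sqrt(5), about 2.236

print(loss_style, metric_style)
```

When the shapes line up like this, the mean of the per-row roots can never exceed the root of the global mean, because the square root is concave. The two only coincide when every row has the same MSE.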
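The raw formula fix is easy, but integrating statefulness takes work and is beyond the scope of this question; refer to the source code to see how it's done. A fix for 2, with a comparison, is below.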
import numpy as np
import tensorflow as tf
from tensorflow.keras.metrics import RootMeanSquaredError as RMSE
def root_mean_squared_error_loss(y_true, y_pred):
    return tf.sqrt(tf.reduce_mean(tf.math.squared_difference(y_true, y_pred)))
np.random.seed(0)
#%%###########################################################################
rmse = RMSE(dtype='float64')
rmsel = root_mean_squared_error_loss
x1 = np.random.randn(32, 10)
y1 = np.random.randn(32, 10)
x2 = np.random.randn(32, 10)
y2 = np.random.randn(32, 10)
#%%###########################################################################
print("TensorFlow RMSE:")
print(rmse(x1, y1))
print(rmse(x2, y2))
print("=" * 46)
print(rmse(x1, y1))
print(rmse(x2, y2))
print("\nMy RMSE:")
print(rmsel(x1, y1))
print(rmsel(x2, y2))
TensorFlow RMSE:
tf.Tensor(1.4132492562096124, shape=(), dtype=float64)
tf.Tensor(1.3875944990740972, shape=(), dtype=float64)
==============================================
tf.Tensor(1.3961984634354354, shape=(), dtype=float64) # same inputs as the first call, different result
tf.Tensor(1.3875944990740972, shape=(), dtype=float64) # equals the second call: the running mean now covers each batch twice
My RMSE:
tf.Tensor(1.4132492562096124, shape=(), dtype=float64) # first result agrees
tf.Tensor(1.3614563994283353, shape=(), dtype=float64) # second differs since stateless
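The pattern in the printed results can be reproduced with a rough plain-Python sketch of a running total and count. RunningRMSE is a hypothetical stand-in written for illustration, not the Keras implementation; it only mirrors the total, count, and reset_states names mentioned above, and the inputs are made-up numbers.

```python
import math

class RunningRMSE:
    """Hypothetical stand-in for a stateful RMSE metric (not the Keras class)."""

    def __init__(self):
        self.reset_states()

    def reset_states(self):
        self.total = 0.0  # running sum of squared errors
        self.count = 0    # running number of elements seen

    def __call__(self, y_true, y_pred):
        # Accumulate into the running state, then report the root of the
        # mean over everything seen since the last reset.
        for t, p in zip(y_true, y_pred):
            self.total += (t - p) ** 2
            self.count += 1
        return math.sqrt(self.total / self.count)

metric = RunningRMSE()
a_true, a_pred = [0.0, 0.0], [1.0, 1.0]  # this batch alone has MSE 1
b_true, b_pred = [0.0, 0.0], [3.0, 3.0]  # this batch alone has MSE 9

first = metric(a_true, a_pred)   # sqrt(2 / 2) = 1.0
second = metric(b_true, b_pred)  # sqrt(20 / 4), about 2.236, not 3.0
third = metric(a_true, a_pred)   # sqrt(22 / 6), about 1.915, not 1.0
fourth = metric(b_true, b_pred)  # sqrt(40 / 8), about 2.236, same as `second`

metric.reset_states()
fresh = metric(b_true, b_pred)   # sqrt(18 / 2) = 3.0 once the state is cleared

print(first, second, third, fourth, fresh)
```

The fourth value equals the second because, after each batch has been seen twice, the running mean is the same as after seeing each once. Since the metric accumulates across batches within an epoch while the loss is computed per batch, the two reported numbers need not agree even when the formula matches.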