Implementing Normalization Inside a TensorFlow Model
I am currently working on a basic LSTM-based autoencoder using the TensorFlow library. The goal is for the autoencoder to reconstruct a multivariate time series. I am interested in moving the feature normalization of the data from the data pipeline into the model itself.
Currently, I normalize the data the following way:
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense, LSTM, TimeDistributed
from tensorflow.keras.layers.experimental.preprocessing import Normalization
from tensorflow.keras.models import Model

normalizer = Normalization(axis=-1)
normalizer.adapt(data_train)
data_train = normalizer(data_train)
inputs = Input(shape=[None, n_inputs])
x = LSTM(4, return_sequences=True)(inputs)
x = LSTM(2, return_sequences=True)(x)
x = LSTM(2, return_sequences=True)(x)
x = LSTM(4, return_sequences=True)(x)
x = TimeDistributed(Dense(n_inputs))(x)
model = Model(inputs, x)
This works as expected and results in a respectable loss (~1e-2), but the normalization lives outside the model.
According to the documentation (under "Preprocessing data before the model or inside the model"), the following code should be equivalent to the snippet above, except that it runs inside the model:
normalizer = Normalization(axis=-1)
normalizer.adapt(data_train)
inputs = Input(shape=[None, n_inputs])
x = normalizer(inputs)
x = LSTM(4, return_sequences=True)(x)
x = LSTM(2, return_sequences=True)(x)
x = LSTM(2, return_sequences=True)(x)
x = LSTM(4, return_sequences=True)(x)
x = TimeDistributed(Dense(n_inputs))(x)
model = Model(inputs, x)
However, running the latter variant produces astronomical loss values (~1e3), and the test results are worse as well. So my question is: what am I doing wrong? Could it be that I am misunderstanding the documentation?
Any advice is greatly appreciated!
The two approaches seem to give consistent results, as long as the normalizer, when used outside the model, is applied only to the inputs (i.e., the feature matrix):
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense, LSTM, TimeDistributed
from tensorflow.keras.models import Model
# note: in TF >= 2.6 this layer is also available as tensorflow.keras.layers.Normalization
from tensorflow.keras.layers.experimental.preprocessing import Normalization
np.random.seed(42)
# define the input parameters
num_samples = 100
time_steps = 10
train_size = 0.8
# generate the data
X = np.random.normal(loc=10, scale=5, size=(num_samples, time_steps, 1))
y = np.mean(X, axis=1) + np.random.normal(loc=0, scale=1, size=(num_samples, 1))
# split the data
X_train, X_test = X[:int(train_size * X.shape[0]), :], X[int(train_size * X.shape[0]):, :]
y_train, y_test = y[:int(train_size * y.shape[0]), :], y[int(train_size * y.shape[0]):, :]
# normalize the inputs inside the model
normalizer = Normalization()
normalizer.adapt(X_train)
inputs = Input(shape=[None, 1])
x = normalizer(inputs)
x = LSTM(4, return_sequences=True)(x)
x = LSTM(2, return_sequences=True)(x)
x = LSTM(2, return_sequences=True)(x)
x = LSTM(4, return_sequences=True)(x)
x = TimeDistributed(Dense(1))(x)
model = Model(inputs, x)
model.compile(loss='mae', optimizer='adam')
model.fit(X_train, y_train, batch_size=32, epochs=10, verbose=0)
print(model.evaluate(X_test, y_test))
# 10.704551696777344
# normalize the inputs outside the model
normalizer = Normalization()
normalizer.adapt(X_train)
X_train_normalized = normalizer(X_train)
X_test_normalized = normalizer(X_test)
inputs = Input(shape=[None, 1])
x = LSTM(4, return_sequences=True)(inputs)
x = LSTM(2, return_sequences=True)(x)
x = LSTM(2, return_sequences=True)(x)
x = LSTM(4, return_sequences=True)(x)
x = TimeDistributed(Dense(1))(x)
model = Model(inputs, x)
model.compile(loss='mae', optimizer='adam')
model.fit(X_train_normalized, y_train, batch_size=32, epochs=10, verbose=0)
print(model.evaluate(X_test_normalized, y_test))
# 10.748750686645508
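Applied to the autoencoder in the question, this likely explains the discrepancy: there, data_train serves as both input and target, so normalizing it in the pipeline also rescales the reconstruction targets, while the in-model normalizer leaves the targets at their original scale and the loss is reported in raw units. Below is a minimal sketch of that effect; the shapes (n_inputs = 3, 100 series of 10 steps) are assumptions, not the question's actual data:
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense, LSTM, TimeDistributed
from tensorflow.keras.layers.experimental.preprocessing import Normalization
from tensorflow.keras.models import Model

np.random.seed(42)

# stand-in for the question's multivariate series (shapes are assumptions)
n_inputs = 3
data_train = np.random.normal(loc=10, scale=5, size=(100, 10, n_inputs)).astype('float32')

normalizer = Normalization(axis=-1)
normalizer.adapt(data_train)

inputs = Input(shape=[None, n_inputs])
x = normalizer(inputs)
x = LSTM(4, return_sequences=True)(x)
x = TimeDistributed(Dense(n_inputs))(x)
model = Model(inputs, x)
model.compile(loss='mse', optimizer='adam')

# raw targets: the loss is measured in the data's original units and
# comes out orders of magnitude larger than in the pipeline variant
model.fit(data_train, data_train, epochs=5, verbose=0)

# normalized targets: input and target are on the same standardized
# scale again, so the loss becomes comparable to the pipeline variant
model.fit(data_train, normalizer(data_train), epochs=5, verbose=0)
In other words, the two loss values in the question (~1e-2 vs. ~1e3) are measured on different scales and are not directly comparable; to compare the variants fairly, either normalize the reconstruction targets as well or map the model's outputs back to the original scale before computing the loss.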