Keras LSTM model overfitting
I am using an LSTM model in Keras. In the fitting stage I added the validation_data parameter, and when I plot my training vs. validation loss there appears to be a severe overfitting problem: my validation loss does not decrease.
My full data is a sequence of shape [50,]. The first 20 records are used for training and the remainder as test data.
I have already tried adding dropout and reducing the model complexity as much as possible, but still with no success.
from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout

# transform data to be stationary
raw_values = series.values
diff_values = difference_series(raw_values, 1)
# transform data to be supervised learning
# using a sliding window
supervised = timeseries_to_supervised(diff_values, 1)
supervised_values = supervised.values
# split data into train and test-sets
train, test = supervised_values[:20], supervised_values[20:]
# transform the scale of the data
# scale function uses MinMaxScaler(feature_range=(-1,1)) and fit via training set and is applied to both train and test.
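# A minimal sketch of what such a scale() helper could look like, assuming
# scikit-learn (the actual implementation is not shown in the question):
from sklearn.preprocessing import MinMaxScaler

def scale(train, test):
    # fit the scaler on the training set only, then apply it to both splits
    scaler = MinMaxScaler(feature_range=(-1, 1)).fit(train)
    return scaler, scaler.transform(train), scaler.transform(test)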
scaler, train_scaled, test_scaled = scale(train, test)
batch_size = 1
nb_epoch = 1000
neurons = 1
X, y = train_scaled[:, 0:-1], train_scaled[:, -1]
X = X.reshape(X.shape[0], 1, X.shape[1])
testX, testY = test_scaled[:, 0:-1].reshape(-1,1,1), test_scaled[:, -1]
model = Sequential()
model.add(LSTM(units=neurons,
               batch_input_shape=(batch_size, X.shape[1], X.shape[2]),
               stateful=True))
model.add(Dropout(0.1))
model.add(Dense(1, activation="linear"))
model.compile(loss='mean_squared_error', optimizer='adam')
history = model.fit(X, y, epochs=nb_epoch, batch_size=batch_size, verbose=0,
                    shuffle=False, validation_data=(testX, testY))
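The loss plot itself was produced with something like the following (a minimal sketch assuming matplotlib; the exact plotting code is not in the question):

import matplotlib.pyplot as plt

# model.fit() records per-epoch metrics in history.history
plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('epoch')
plt.ylabel('MSE')
plt.legend()
plt.show()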
Here is what it looks like when varying the number of neurons. I even tried using Keras Tuner (Hyperband) to find the optimal parameters.
import keras
import keras_tuner as kt

def fit_model(hp):
    batch_size = 1
    model = Sequential()
    model.add(LSTM(units=hp.Int("units", min_value=1, max_value=20, step=1),
                   batch_input_shape=(batch_size, X.shape[1], X.shape[2]),
                   stateful=True))
    # the Dense layer needs its own hyperparameter name; reusing "units"
    # here would silently tie it to the LSTM's value
    model.add(Dense(units=hp.Int("dense_units", min_value=1, max_value=10),
                    activation="linear"))
    model.compile(loss='mse', metrics=["mse"],
                  optimizer=keras.optimizers.Adam(
                      hp.Choice("learning_rate", values=[1e-2, 1e-3, 1e-4])))
    return model
X, y = train_scaled[:, 0:-1], train_scaled[:, -1]
X = X.reshape(X.shape[0], 1, X.shape[1])
tuner = kt.Hyperband(
    fit_model,
    objective='mse',
    max_epochs=100,
    hyperband_iterations=2,
    overwrite=True)
tuner.search(X, y, epochs=100, validation_split=0.2)
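Retrieving and refitting the "best model" afterwards presumably went through the standard Keras Tuner calls; a minimal sketch, since that code is not shown in the question:

best_hp = tuner.get_best_hyperparameters(num_trials=1)[0]
best_model = tuner.hypermodel.build(best_hp)
# stateful LSTM with a fixed batch_input_shape, so batch_size must stay 1
history = best_model.fit(X, y, epochs=100, batch_size=1, shuffle=False,
                         validation_data=(testX, testY))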
When evaluating the model on X_test and y_test, I get the same loss and accuracy scores. But when fitting the "best model", I get this:
However, judging against my true values, my predictions look quite reasonable. What should I do to get a better fit?
You cannot do much with only 20 records; the training data is simply too small. There is not enough variation in the training data for the model to approximate the function accurately, so your validation data (which is probably far smaller than 20 samples) likely contains an example quite unlike any of the 20 seen in training (i.e., the model saw nothing of that nature during training), which drives the validation loss up.
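The answer does not prescribe a fix, but one standard way to get more out of a 50-point series is walk-forward (expanding-window) validation. The sketch below is an added suggestion, not part of the original answer; build_and_fit is a hypothetical helper wrapping the model-building code from the question:

n_total = len(supervised_values)
errors = []
for i in range(20, n_total):
    # grow the training window one step at a time and validate on the
    # single next observation
    train, test = supervised_values[:i], supervised_values[i:i + 1]
    scaler, train_scaled, test_scaled = scale(train, test)
    X, y = train_scaled[:, 0:-1], train_scaled[:, -1]
    X = X.reshape(X.shape[0], 1, X.shape[1])
    testX = test_scaled[:, 0:-1].reshape(-1, 1, 1)
    model = build_and_fit(X, y)  # hypothetical: builds and fits the LSTM above
    pred = model.predict(testX, batch_size=1)
    errors.append(float(pred[0, 0]) - float(test_scaled[0, -1]))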