Extremely poor prediction: LSTM time-series

Question

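I am trying to implement an LSTM model for time-series prediction. Below is my trial code. It runs without error, and you can try it yourself without any extra dependencies.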

import numpy as np, pandas as pd, matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed, Bidirectional
from sklearn.metrics import mean_squared_error, accuracy_score
from scipy.stats import linregress
from sklearn.utils import shuffle

fi = 'pollution.csv'
raw = pd.read_csv(fi, delimiter=',')
raw = raw.drop('Dates', axis=1)
print (raw.shape)

scaler = MinMaxScaler(feature_range=(-1, 1))
raw = scaler.fit_transform(raw)

time_steps = 7
def create_ds(data, t_steps):
    data = pd.DataFrame(data)
    data_s = data.copy()
    for i in range(t_steps):
        data = pd.concat([data, data_s.shift(-(i+1))], axis = 1)   
    data.dropna(axis=0, inplace=True)
    return data.values

ds = create_ds(raw, time_steps)
print (ds.shape)
n_feats = raw.shape[1]
n_obs = time_steps * n_feats

n_rows = ds.shape[0]
train_size = int(n_rows * 0.8)

train_data = ds[:train_size, :]
train_data = shuffle(train_data)

test_data = ds[train_size:, :]

x_train = train_data[:, :n_obs]
y_train = train_data[:, n_obs:]
x_test = test_data[:, :n_obs]
y_test = test_data[:, n_obs:]

x_train = x_train.reshape(1, x_train.shape[0], x_train.shape[1])
y_train = y_train.reshape(1, y_train.shape[0], y_train.shape[1])
x_test = x_test.reshape(1, x_test.shape[0], x_test.shape[1])

print (x_train.shape)
print (y_train.shape)
print (x_test.shape)
print (y_test.shape)

model = Sequential()
model.add(LSTM(64, return_sequences=True, input_shape=(None, x_train.shape[2]), stateful=True, batch_size=1))
model.add(LSTM(32, return_sequences=True, stateful=True))
model.add(LSTM(n_feats, return_sequences=True, stateful=True)) 

model.compile(loss='mse', optimizer='rmsprop')
model.fit(x_train, y_train, epochs=10, batch_size=1, verbose=2)  
y_predict = model.predict(x_test)
y_predict = y_predict.reshape(y_predict.shape[1], y_predict.shape[2])

y_predict = scaler.inverse_transform(y_predict)

y_test = scaler.inverse_transform(y_test)
y_test = y_test[:,0]
y_predict = y_predict[:,0]

print (y_test.shape)
print (y_predict.shape)

plt.plot(y_test, label='True')
plt.plot(y_predict,  label='Predict')
plt.legend()
plt.show()

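However, the prediction is extremely poor. How can I improve it? Do you have any ideas?

Any ideas for improving the prediction by redesigning the architecture and/or the layers?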

Answer 1

You may consider changing your model:

import numpy as np, pandas as pd, matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed, Bidirectional
from sklearn.metrics import mean_squared_error, accuracy_score
from scipy.stats import linregress
from sklearn.utils import shuffle

fi = 'pollution.csv'
raw = pd.read_csv(fi, delimiter=',')
raw = raw.drop('Dates', axis=1)
print (raw.shape)

scaler = MinMaxScaler(feature_range=(-1, 1))
raw = scaler.fit_transform(raw)

time_steps = 7
def create_ds(data, t_steps):
    data = pd.DataFrame(data)
    data_s = data.copy()
    for i in range(t_steps):
        data = pd.concat([data, data_s.shift(-(i+1))], axis = 1)   
    data.dropna(axis=0, inplace=True)
    return data.values

ds = create_ds(raw, time_steps)
print (ds.shape)
n_feats = raw.shape[1]
n_obs = time_steps * n_feats

n_rows = ds.shape[0]
train_size = int(n_rows * 0.8)

train_data = ds[:train_size, :]
train_data = shuffle(train_data)

test_data = ds[train_size:, :]

x_train = train_data[:, :n_obs]
y_train = train_data[:, n_obs:]
x_test = test_data[:, :n_obs]
y_test = test_data[:, n_obs:]

print (x_train.shape)
print (x_test.shape)
print (y_train.shape)
print (y_test.shape)

x_train = x_train.reshape(x_train.shape[0], time_steps, n_feats)
x_test = x_test.reshape(x_test.shape[0], time_steps, n_feats)

print (x_train.shape)
print (x_test.shape)
print (y_train.shape)
print (y_test.shape)

model = Sequential()
model.add(LSTM(64, input_shape=(time_steps, n_feats), return_sequences=True))
model.add(LSTM(32, return_sequences=False))
model.add(Dense(n_feats))

model.compile(loss='mse', optimizer='rmsprop')
model.fit(x_train, y_train, epochs=10, batch_size=1, verbose=1, shuffle=False)

y_predict = model.predict(x_test)
print (y_predict.shape)
y_predict = scaler.inverse_transform(y_predict)

y_test = scaler.inverse_transform(y_test)
y_test = y_test[:,0]
y_predict = y_predict[:,0]

print (y_test.shape)
print (y_predict.shape)

plt.plot(y_test, label='True')
plt.plot(y_predict,  label='Predict')
plt.legend()
plt.show()

But I don't really see the merits of your implementation:

* both x and y are 3d (1,steps,features) rather than x in 3d (samples, time-steps, features) and y in 2d (samples, features)
* input_shape=(None, x_train.shape[2])
* last layer - model.add(LSTM(n_feats, return_sequences=True, stateful=True))
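
To make the shape difference in the first bullet concrete, here is a minimal NumPy sketch of windowing that yields x as (samples, time-steps, features) and y as (samples, features). The helper name `make_windows` and the toy array are illustrative assumptions, not part of the code above.

```python
import numpy as np

def make_windows(series, time_steps):
    """Split a (n_rows, n_feats) array into overlapping input windows and next-step targets."""
    series = np.asarray(series, dtype=float)
    n_samples = series.shape[0] - time_steps
    if n_samples <= 0:
        raise ValueError("series is too short for the requested time_steps")
    # x[i] holds rows i .. i+time_steps-1; y[i] is the row right after that window
    x = np.stack([series[i:i + time_steps] for i in range(n_samples)])
    y = series[time_steps:]
    return x, y

# Toy data: 20 time steps, 5 features (hypothetical values)
toy = np.arange(100, dtype=float).reshape(20, 5)
x, y = make_windows(toy, time_steps=7)
print(x.shape)  # (13, 7, 5)
print(y.shape)  # (13, 5)
```

With this layout each sample is an independent short window, so a non-stateful LSTM followed by a Dense layer can consume it directly.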

Someone else may provide a better answer.

Answer 2

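If you want to use the model from my code (the link you passed), you need to shape the data correctly: (1 sequence, total_time_steps, 5 features).

Important: I don't know if this is the best approach or the best model, but this model predicts 7 time steps ahead of the input (time_shift=7).

Data and initial vars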

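import pandas as pd, matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import LSTM

fi = 'pollution.csv'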
raw = pd.read_csv(fi, delimiter=',')
raw = raw.drop('Dates', axis=1)
print("raw shape:")
print (raw.shape)
#(1789,5) - 1789 time steps / 5 features

scaler = MinMaxScaler(feature_range=(-1, 1))
raw = scaler.fit_transform(raw)

time_shift = 7 #shift is the number of steps we are predicting ahead
n_rows = raw.shape[0] #n_rows is the number of time steps of our sequence
n_feats = raw.shape[1]
train_size = int(n_rows * 0.8)


#I couldn't understand how "ds" worked, so I simply removed it because in the code below it's not necessary

#getting the train part of the sequence
train_data = raw[:train_size, :] #first train_size steps, all 5 features
test_data = raw[train_size:, :] #I'll use the beginning of the data as state adjuster


#train_data = shuffle(train_data) !!!!!! we cannot shuffle time steps!!! we lose the sequence doing this

x_train = train_data[:-time_shift, :] #the entire train data, except the last shift steps 
x_test = test_data[:-time_shift,:] #the entire test data, except the last shift steps
x_predict = raw[:-time_shift,:] #the entire raw data, except the last shift steps

y_train = train_data[time_shift:, :] 
y_test = test_data[time_shift:,:]
y_predict_true = raw[time_shift:,:]

x_train = x_train.reshape(1, x_train.shape[0], x_train.shape[1]) #ok shape (1,steps,5) - 1 sequence, many steps, 5 features
y_train = y_train.reshape(1, y_train.shape[0], y_train.shape[1])
x_test = x_test.reshape(1, x_test.shape[0], x_test.shape[1])
y_test = y_test.reshape(1, y_test.shape[0], y_test.shape[1])
x_predict = x_predict.reshape(1, x_predict.shape[0], x_predict.shape[1])
y_predict_true = y_predict_true.reshape(1, y_predict_true.shape[0], y_predict_true.shape[1])

print("\nx_train:")
print (x_train.shape)
print("y_train")
print (y_train.shape)
print("x_test")
print (x_test.shape)
print("y_test")
print (y_test.shape)

Model

Your model wasn't very powerful for this task, so I tried a bigger one (this one, on the other hand, is too powerful).

model = Sequential()
model.add(LSTM(64, return_sequences=True, input_shape=(None, x_train.shape[2])))
model.add(LSTM(128, return_sequences=True))
model.add(LSTM(256, return_sequences=True))
model.add(LSTM(128, return_sequences=True))
model.add(LSTM(64, return_sequences=True))
model.add(LSTM(n_feats, return_sequences=True)) 

model.compile(loss='mse', optimizer='adam')

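Fitting

Notice that I had to train for 2000+ epochs before the model gave good results.
I added the validation data so that we can compare the loss for train and test.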

#notice that I'm predicting from the ENTIRE sequence, including x_train      
#is important for the model to adjust its states before predicting the end
model.fit(x_train, y_train, epochs=1000, batch_size=1, verbose=2, validation_data=(x_test,y_test))

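Predicting

Important: when predicting the end of a sequence based on its beginning, the model needs to see the beginning in order to adjust its internal states, so I'm predicting over the entire data (x_predict), not only the test data.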

y_predict_model = model.predict(x_predict)

print("\ny_predict_true:")
print (y_predict_true.shape)
print("y_predict_model: ")
print (y_predict_model.shape)


def plot(true, predicted, divider):

    predict_plot = scaler.inverse_transform(predicted[0])
    true_plot = scaler.inverse_transform(true[0])

    predict_plot = predict_plot[:,0]
    true_plot = true_plot[:,0]

    plt.figure(figsize=(16,6))
    plt.plot(true_plot, label='True',linewidth=5)
    plt.plot(predict_plot,  label='Predict',color='y')

    if divider > 0:
        maxVal = max(true_plot.max(),predict_plot.max())
        minVal = min(true_plot.min(),predict_plot.min())

        plt.plot([divider,divider],[minVal,maxVal],label='train/test limit',color='k')

    plt.legend()
    plt.show()

test_size = n_rows - train_size
print("test length: " + str(test_size))

plot(y_predict_true,y_predict_model,train_size)
plot(y_predict_true[:,-2*test_size:],y_predict_model[:,-2*test_size:],test_size)

Showing entire data

Showing the end portion of it for more detail

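Please notice that this model is overfitting: it can learn the training data and still get bad results on the test data.

To solve this, you must experiment with smaller models, dropout layers, and other techniques that prevent overfitting.

Notice also that this data very probably contains a lot of random factors, meaning the models will not be able to learn anything useful from it. As you make smaller models to avoid overfitting, you may also find that the model gives worse predictions for the training data.

Finding the perfect model is not an easy task; it's an open question and you must experiment. Maybe LSTM models simply aren't the solution. Maybe your data is simply not predictable, and so on. There isn't a definitive answer for this.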
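
One rough way to get a feel for whether the data is predictable at all is to compare the model's loss against a naive baseline. The sketch below is my own illustration, not something measured on pollution.csv: `persistence_mse` is a hypothetical helper that scores the forecast "the value time_shift steps ahead equals the current value", and both series are synthetic.

```python
import numpy as np

def persistence_mse(series, time_shift):
    """MSE of the naive forecast 'the value time_shift steps ahead equals the current value'."""
    if time_shift < 1:
        raise ValueError("time_shift must be at least 1")
    series = np.asarray(series, dtype=float)
    diff = series[time_shift:] - series[:-time_shift]
    return float(np.mean(diff ** 2))

rng = np.random.default_rng(0)
smooth = np.sin(np.linspace(0, 4 * np.pi, 400))  # slowly varying, predictable
noise = rng.uniform(-1, 1, size=400)             # no structure at all
print(persistence_mse(smooth, 7) < persistence_mse(noise, 7))  # True
```

If you run this on the scaled data with the same time_shift, the result is on the same scale as the mse loss, so a model whose validation loss is not clearly below it has probably learned little beyond the overall level of the series.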

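How to know the model is good

With the validation data in training, you can compare the loss for the training data with the loss for the test data.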

Train on 1 samples, validate on 1 samples
Epoch 1/1000
9s - loss: 0.4040 - val_loss: 0.3348
Epoch 2/1000
4s - loss: 0.3332 - val_loss: 0.2651
Epoch 3/1000
4s - loss: 0.2656 - val_loss: 0.2035
Epoch 4/1000
4s - loss: 0.2061 - val_loss: 0.1696
Epoch 5/1000
4s - loss: 0.1761 - val_loss: 0.1601
Epoch 6/1000
4s - loss: 0.1697 - val_loss: 0.1476
Epoch 7/1000
4s - loss: 0.1536 - val_loss: 0.1287
Epoch 8/1000
.....

Both should go down together. When the test loss stops going down but the training loss keeps improving, your model is starting to overfit.
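
The stopping rule this implies can be sketched in plain Python. This is only an illustration: `best_epoch`, the `patience` value, and the loss values are made up for the example. In practice Keras ships an `EarlyStopping` callback (with `monitor='val_loss'`, `patience`, and `restore_best_weights`) that does the same job during `fit`.

```python
def best_epoch(val_losses, patience=3):
    """Return (index of best epoch, index where training would stop).

    Training stops once the validation loss has gone `patience` epochs
    without improving on its best value so far.
    """
    best_idx, best_val = 0, float("inf")
    for i, v in enumerate(val_losses):
        if v < best_val:
            best_idx, best_val = i, v
        elif i - best_idx >= patience:
            return best_idx, i
    return best_idx, len(val_losses) - 1

# Hypothetical validation losses: improving, then rising
val_history = [0.33, 0.27, 0.20, 0.17, 0.16, 0.15, 0.13, 0.14, 0.15, 0.16, 0.18]
best, stopped = best_epoch(val_history, patience=3)
print(best, stopped)  # 6 9
```

The weights worth keeping are those from the best epoch, not from the epoch where training stopped.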

Trying another model

The best I could do (though I didn't really try much) was with this model:

model = Sequential()
model.add(LSTM(64, return_sequences=True, input_shape=(None, x_train.shape[2])))
model.add(LSTM(128, return_sequences=True))
model.add(LSTM(128, return_sequences=True))
model.add(LSTM(64, return_sequences=True))
model.add(LSTM(n_feats, return_sequences=True)) 

model.compile(loss='mse', optimizer='adam')

When the losses were about:

loss: 0.0389 - val_loss: 0.0437

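After this point, the validation loss started going up (so training beyond this point is totally useless).

Result:

This shows that all this model can learn is the very general behavior, such as zones with higher values.

But the high-frequency part was either too random or the model wasn't good enough for it...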

Answer 3

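I'm not really sure what you can do; that data looks like it has no discernible pattern. If I can't see one, I doubt the LSTM can. Your prediction does look like a good regression line, though.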

Answer 4

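Reading the original code, it seems that the author first scaled the dataset and then split it into training and testing subsets. This means that information about the testing subset (e.g. volatility) has "leaked" into the training subset.

The recommended approach is to perform the training/testing split first, compute the scaling parameters using only the training subset, and then use these parameters to scale the training and testing subsets separately.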
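
A minimal NumPy sketch of that order of operations, assuming the same (-1, 1) range as the question. The helpers `fit_minmax` and `apply_minmax` and the toy data are hypothetical; with scikit-learn the equivalent is to call `scaler.fit` on the training rows only and then `scaler.transform` on both subsets.

```python
import numpy as np

def fit_minmax(train, feature_range=(-1.0, 1.0)):
    """Compute per-feature min-max parameters from the training rows only."""
    lo, hi = feature_range
    data_min = train.min(axis=0)
    data_range = train.max(axis=0) - data_min
    data_range[data_range == 0] = 1.0  # avoid division by zero for constant features
    return data_min, data_range, lo, hi

def apply_minmax(data, params):
    """Scale data with previously fitted parameters; test values may fall outside the range."""
    data_min, data_range, lo, hi = params
    return (data - data_min) / data_range * (hi - lo) + lo

# Hypothetical data: 10 time steps, 2 features, rising over time
raw = np.column_stack([np.arange(10.0), np.arange(10.0) * 2])
train_size = int(len(raw) * 0.8)   # split first
train, test = raw[:train_size], raw[train_size:]
params = fit_minmax(train)         # fit on the training rows only
train_s, test_s = apply_minmax(train, params), apply_minmax(test, params)
print(train_s.min(), train_s.max())  # -1.0 1.0
print(test_s.max() > 1.0)            # True: test values exceed the training range
```

Scaled test values landing outside (-1, 1) is expected here; it reflects that the model has never seen that range, which is exactly the information the leak would have hidden.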

Answer 5

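I am building a model to predict this kind of data myself. I created a SMOTErnn solution to add as past data, and I found that using TimeseriesGenerator with a higher batch_size and a bigger stride performs better and better.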


Tags: python, deep-learning, keras, tensorflow, keras-layer
