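Error after attempting to train simple LSTM with SPY data
I think these errors have to do with the format of my data or with how my code interacts with the dataset, but I am not a developer by any means, so I am not really sure what is going on.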
/Users/kylehammerberg/PycharmProjects/LSTM1P/matplottest.py:54: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
  X_test = np.array(X_test)
Traceback (most recent call last):
  File "/Users/kylehammerberg/PycharmProjects/LSTM1P/matplottest.py", line 55, in <module>
    X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))
IndexError: tuple index out of range
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import keras
url = 'https://raw.githubusercontent.com/khammerberg53/MLPROJ1/main/SP500.csv'
dataset_train = pd.read_csv(url)
training_set = dataset_train.iloc[:, 1:2].values
dataset_train.head()
print(dataset_train.head())
from sklearn.preprocessing import MinMaxScaler
sc = MinMaxScaler(feature_range=(0,1))
training_set_scaled = sc.fit_transform(training_set)
X_train = []
y_train = []
for i in range(60, 2000):
    X_train.append(training_set_scaled[i-60:i, 0])
    y_train.append(training_set_scaled[i, 0])
X_train, y_train = np.array(X_train), np.array(y_train)
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dropout
from keras.layers import Dense
model = Sequential()
model.add(LSTM(units=50,return_sequences=True,input_shape=(X_train.shape[1], 1)))
model.add(Dropout(0.2))
model.add(LSTM(units=50,return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(units=50,return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(units=50))
model.add(Dropout(0.2))
model.add(Dense(units=1))
model.compile(optimizer='adam',loss='mean_squared_error')
model.fit(X_train,y_train,epochs=100,batch_size=32)
url = 'https://raw.githubusercontent.com/khammerberg53/MLPROJ1/main/SP500%20test%20setcsv.csv'
dataset_test = pd.read_csv(url)
real_stock_price = dataset_test.iloc[:, 1:2].values
dataset_total = pd.concat((dataset_train['Value'], dataset_test['Value']), axis = 0)
inputs = dataset_total[len(dataset_total) - len(dataset_test) - 60:].values
inputs = inputs.reshape(-1,1)
inputs = sc.transform(inputs)
X_test = []
for i in range(3, 100):
    X_test.append(inputs[i-60:i, 0])
X_test = np.array(X_test)
X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))
predicted_stock_price = model.predict(X_test)
predicted_stock_price = sc.inverse_transform(predicted_stock_price)
plt.plot(real_stock_price, color = 'black', label = 'TATA Stock Price')
plt.plot(predicted_stock_price, color = 'green', label = 'Predicted TATA Stock Price')
plt.title('TATA Stock Price Prediction')
plt.xlabel('Time')
plt.ylabel('TATA Stock Price')
plt.legend()
plt.show()
print(plt.show())
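With the way you define X_test, the range cannot go from 3 to 100. For any i below 60, the start index i-60 is negative, so inputs[i-60:i, 0] counts from the end of the array and comes back empty, while every i from 60 upward yields a full 60-value window. np.array then receives rows of different lengths, which triggers the ragged-sequence warning and leaves X_test one-dimensional, so X_test.shape[1] raises the IndexError. If you change your code to: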
inputs = inputs.reshape(-1,1)
inputs = sc.transform(inputs)
X_test = []
for i in range(60, 161):
    X_test.append(inputs[i-60:i, 0])
X_test = np.array(X_test)
X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))
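To see the difference between the two loop bounds without training anything, here is a minimal sketch. It assumes a stand-in `inputs` array of 161 rows (a made-up length chosen to match range(60, 161); the real length depends on the CSV) and a hypothetical `window` variable for the 60-step look-back. It is only an illustration of the slicing behaviour, not a replacement for the code above.

```python
import numpy as np

# Hypothetical stand-in for the scaled `inputs` array: 161 rows is an
# assumption that matches range(60, 161); the real length depends on the CSV.
inputs = np.arange(161, dtype=float).reshape(-1, 1)
window = 60  # look-back length, same as the one used to build X_train

# Original bounds: record how many values each slice actually returns.
bad_lengths = [len(inputs[i - window:i, 0]) for i in range(3, 100)]
print(sorted(set(bad_lengths)))  # [0, 60]: empty slices mixed with full windows

# Bounds derived from the data instead of hard-coded numbers.
X_test = np.array([inputs[i - window:i, 0] for i in range(window, len(inputs))])
X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))
print(X_test.shape)  # (101, 60, 1)
```

Using range(window, len(inputs)) rather than literal numbers means the loop produces len(inputs) - 60 windows, which is one per test row when inputs is built as in the question.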
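The rest of your code will produce this (I only used 2 epochs, which may explain why the prediction is not what you would expect):
With 20 epochs, you get this: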