LSTM raises ValueError: Shapes (5, 2, 3) and (5, 3) are incompatible
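I want to do multi-class classification on time-series data. The dataset I have needs a lot of preprocessing, so to understand how to implement the model I am using the IRIS dataset (which is not suited to an LSTM), because it has exactly the same structure as my time-series data (4 input features, 1 output feature, 120 samples). I implemented the following code, but fitting the model with a batch size of 5 raises an invalid shape error (I changed the batch size several times, but it made no difference).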
# Load dataset
import pandas
from keras.utils import np_utils
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
dataframe = pandas.read_csv("iris.csv", header=None)
dataset = dataframe.values
X = dataset[:, 0:4].astype(float)
Y = dataset[:, 4]
# Encode the output variable
encoder = LabelEncoder()
encoder.fit(Y)
# Convert the class labels into integers
encoded_Y = encoder.transform(Y)
# Convert integers to dummy variables (one-hot encoded)
dummy_Y = np_utils.to_categorical(encoded_Y)
X_train, X_test, y_train, y_test = train_test_split(X, dummy_Y, test_size=0.2)  # 20% is held out for testing
X_train = X_train.reshape(60, 2, 4)
y_train = y_train.reshape(60, 2, 3)
y_train.shape, X_train.shape
((60, 2, 3), (60, 2, 4))
# Create the neural network model
from keras.models import Sequential
from keras.layers import LSTM, Dense
def create_nn_model():
    # Create sequential model
    model = Sequential()
    model.add(LSTM(100, dropout=0.2, input_shape=(X_train.shape[1], X_train.shape[2])))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(3, activation='softmax'))
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
model = create_nn_model()
model.summary()
> Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm_1 (LSTM) (None, 100) 42000
_________________________________________________________________
dense_2 (Dense) (None, 100) 10100
_________________________________________________________________
dense_3 (Dense) (None, 3) 303
=================================================================
Total params: 52,403
Trainable params: 52,403
Non-trainable params: 0
model.fit(X_train,y_train,epochs=200,batch_size=5)
> ValueError Traceback (most recent call last)
<ipython-input-26-0aef33c299f0> in <module>()
----> 1 model.fit(X_train,y_train,epochs=200,batch_size=5) #X_train is independant variables. based on the amount of the data set data set will be trained by breaking into batches
9 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/func_graph.py in wrapper(*args, **kwargs)
984 except Exception as e: # pylint:disable=broad-except
985 if hasattr(e, "ag_error_metadata"):
--> 986 raise e.ag_error_metadata.to_exception(e)
987 else:
988 raise
ValueError: in user code:
/usr/local/lib/python3.7/dist-packages/keras/engine/training.py:830 train_function *
return step_function(self, iterator)
/usr/local/lib/python3.7/dist-packages/keras/engine/training.py:813 run_step *
outputs = model.train_step(data)
/usr/local/lib/python3.7/dist-packages/keras/engine/training.py:771 train_step *
loss = self.compiled_loss(
/usr/local/lib/python3.7/dist-packages/keras/engine/compile_utils.py:201 __call__ *
loss_value = loss_obj(y_t, y_p, sample_weight=sw)
/usr/local/lib/python3.7/dist-packages/keras/losses.py:142 __call__ *
losses = call_fn(y_true, y_pred)
/usr/local/lib/python3.7/dist-packages/keras/losses.py:246 call *
return ag_fn(y_true, y_pred, **self._fn_kwargs)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py:206 wrapper **
return target(*args, **kwargs)
/usr/local/lib/python3.7/dist-packages/keras/losses.py:1631 categorical_crossentropy
y_true, y_pred, from_logits=from_logits)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py:206 wrapper
return target(*args, **kwargs)
/usr/local/lib/python3.7/dist-packages/keras/backend.py:4827 categorical_crossentropy
target.shape.assert_is_compatible_with(output.shape)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/tensor_shape.py:1161 assert_is_compatible_with
raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (5, 2, 3) and (5, 3) are incompatible
Your y_true and y_pred do not have the same shape. You may need to define your LSTM as follows:
model.add(LSTM(100,dropout=0.2, input_shape=(2,4), return_sequences=True))
....
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
....
dense_3 (Dense) (None, 2, 3) 303 < ---
=================================================================
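Update
Using return_sequences=True works because you defined your training pairs this way:
X_train = X_train.reshape(60, 2, 4)
y_train = y_train.reshape(60, 2, 3)
Both follow the (batch_size, timestep, input_length) layout. Note, however, that only the input has to be reshaped to satisfy the input requirement of the LSTM layer in the model above, not y_train. When you define the model without return_sequences, the last layer ends up with just three class outputs and no timestep dimension, yet your y_train is defined with one. If you set return_sequences to True and print the model summary, you will see that the output shape of the last layer is (None, 2, 3), which exactly matches the shape of y_train.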
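The mismatch can be reproduced without training anything. The following is a minimal sketch of the kind of shape check that fails here; `shapes_compatible` is a hypothetical helper written for illustration, not the Keras or TensorFlow implementation, and it assumes `None` stands for an unknown dimension such as the batch size.

```python
def shapes_compatible(target_shape, output_shape):
    """Return True if two shapes have the same rank and matching dimensions.

    Hypothetical helper for illustration; None means "unknown dimension".
    """
    if len(target_shape) != len(output_shape):
        return False
    return all(
        t is None or o is None or t == o
        for t, o in zip(target_shape, output_shape)
    )


# A y_train batch vs. the model output without return_sequences
print(shapes_compatible((5, 2, 3), (5, 3)))        # False
# A y_train batch vs. the model output with return_sequences=True
print(shapes_compatible((5, 2, 3), (None, 2, 3)))  # True
# One label per sequence also matches the original model output
print(shapes_compatible((5, 3), (None, 3)))        # True
```

The last case suggests the other way out: keep the model as it is and give each sequence a single label of shape (3,), rather than one label per timestep.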
Before looking at what return_sequences does here, you may want to understand what a timestep means in an LSTM model; see this answer. AFAIK, it depends on how many timesteps you need to set for your input; the LSTM cell can be applied once or multiple times (up to the n-th timestep). For the n-th timestep (n: {1, 2, 3, ..., N}), if I want the LSTM to return the output of every timestep (n outputs), I set return_sequences=True; otherwise I set return_sequences=False. From the doc:
return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence. Default: False.
In short, if it is set to True, the output of every timestep is returned; if it is set to False, only the last output is returned. For example:
import tensorflow as tf
inputs = tf.random.normal([32, 8])
inputs = tf.reshape(inputs, [-1, 2, 4])  # or [-1, 4, 2] or [-1, 1, 8]
inputs.shape
TensorShape([32, 2, 4]) # (batch_size, timestep, input_length)
lstm = tf.keras.layers.LSTM(10, return_sequences=True)
whole_seq_output = lstm(inputs)
print(whole_seq_output.shape)
(32, 2, 10) # (batch_size, timestep, output_length)
lstm = tf.keras.layers.LSTM(10, return_sequences=False)
last_seq_output = lstm(inputs)
print(last_seq_output.shape)
(32, 10) # (batch_size, output_length)
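To see where the two output shapes come from, here is a rough NumPy sketch of an LSTM forward pass. It is not the Keras implementation: `lstm_forward`, its random weight initialisation, and the gate ordering are assumptions made for illustration. The only point is that the hidden state is computed at every timestep, and return_sequences decides whether all of them or just the last one are returned.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_forward(inputs, units, return_sequences=False, seed=0):
    """Run a randomly initialised LSTM over (batch, timesteps, features) inputs."""
    rng = np.random.default_rng(seed)
    batch, timesteps, features = inputs.shape
    # Each matrix holds the weights of the input, forget, cell and output gates.
    W = rng.normal(scale=0.1, size=(features, 4 * units))
    U = rng.normal(scale=0.1, size=(units, 4 * units))
    b = np.zeros(4 * units)

    h = np.zeros((batch, units))  # hidden state
    c = np.zeros((batch, units))  # cell state
    outputs = []
    for t in range(timesteps):
        z = inputs[:, t, :] @ W + h @ U + b
        i, f, g, o = np.split(z, 4, axis=1)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(h)

    if return_sequences:
        return np.stack(outputs, axis=1)  # (batch, timesteps, units)
    return h  # (batch, units)


x = np.random.default_rng(1).normal(size=(32, 2, 4))
full = lstm_forward(x, units=10, return_sequences=True)
last = lstm_forward(x, units=10, return_sequences=False)
print(full.shape)  # (32, 2, 10)
print(last.shape)  # (32, 10)
print(np.allclose(full[:, -1, :], last))  # True: the last step of the full sequence
```

Because both calls use the same seed, the final timestep of the full sequence equals the output returned with return_sequences=False.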
Here is one way to approach the code above. The iris data is taken from here.
import pandas
dataframe = pandas.read_csv("/content/iris.csv")
dataframe.head(3)
sepal.length sepal.width petal.length petal.width variety
0 5.1 3.5 1.4 0.2 Setosa
1 4.9 3.0 1.4 0.2 Setosa
2 4.7 3.2 1.3 0.2 Setosa
dataframe.variety.unique()
array(['Setosa', 'Versicolor', 'Virginica'], dtype=object)
target_map = dict(zip(list(dataframe['variety'].unique()),
([0, 1, 2])))
target_map
{'Setosa': 0, 'Versicolor': 1, 'Virginica': 2}
dataframe['target'] = dataframe.variety.map(target_map)
dataframe.sample()
sepal.length sepal.width petal.length petal.width variety target
128 6.4 2.8 5.6 2.1 Virginica 2
X = dataframe.iloc[:, :4]
Y = dataframe.iloc[:, 5]
X.shape, Y.shape
((150, 4), (150,))
from tensorflow.keras.utils import to_categorical
from sklearn.model_selection import train_test_split
OHE_Y = to_categorical(Y, num_classes=3)
X_train, X_test, y_train, y_test = train_test_split(X, OHE_Y,
test_size=0.2)
X_train.shape
(120, 4)
# make it lstm compatible input
X_train = X_train.values.reshape(-1, 1, 4)
X_train.shape ,y_train.shape
((120, 1, 4), (120, 3))
Model
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Dense
def create_nn_model():
    model = Sequential()
    model.add(LSTM(100, dropout=0.2, input_shape=(X_train.shape[1],
                                                   X_train.shape[2])))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(3, activation='softmax'))
    model.compile(loss='categorical_crossentropy',
                  optimizer='adam', metrics=['accuracy'])
    return model
model = create_nn_model()
model.summary()
model.fit(X_train, y_train, epochs=10,batch_size=5)
...
Epoch 9/10
3ms/step - loss: 0.5224 - accuracy: 0.7243
Epoch 10/10
3ms/step - loss: 0.5568 - accuracy: 0.7833
Inference
model.evaluate(X_train, y_train)
4ms/step - loss: 0.3843 - accuracy: 0.9583
[0.38432881236076355, 0.9583333134651184]
y_pred = model.predict(X_train).argmax(-1)
y_pred
array([2, 1, 1, 1, 1, 2, 2, 0, 1, 2, 2, 2, 0, 1, 1, 1, 0, 1, 0, 0, 2, 0,
0, 2, 2, 0, 0, 2, 0, 0, 1, 0, 0, 1, 0, 2, 2, 0, 2, 2, 0, 2, 0, 0,
1, 1, 2, 0, 1, 2, 1, 2, 0, 0, 2, 2, 2, 0, 0, 0, 2, 2, 2, 0, 0, 0,
2, 2, 0, 2, 1, 0, 2, 1, 0, 0, 0, 1, 1, 1, 0, 2, 2, 1, 1, 0, 2, 0,
0, 2, 1, 0, 2, 1, 1, 1, 1, 2, 1, 0, 1, 2, 1, 1, 2, 1, 1, 1, 2, 2,
0, 1, 2, 1, 0, 0, 2, 1, 2, 0])
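Note that the evaluation above is run on the training data. To check how the model does on held-out data, X_test would need the same reshape before it is passed to the model. The sketch below uses NumPy only; `to_lstm_input` and `accuracy_from_one_hot` are hypothetical helpers, and the small arrays are made-up stand-ins for X_test, y_test and the output of model.predict.

```python
import numpy as np


def to_lstm_input(features):
    """Reshape (samples, features) to (samples, 1, features): one timestep per sample."""
    features = np.asarray(features, dtype=float)
    return features.reshape(-1, 1, features.shape[1])


def accuracy_from_one_hot(y_true_one_hot, probabilities):
    """Fraction of rows whose arg-max class matches the one-hot label."""
    true_classes = np.argmax(y_true_one_hot, axis=-1)
    predicted_classes = np.argmax(probabilities, axis=-1)
    return float(np.mean(true_classes == predicted_classes))


# Made-up stand-ins for X_test, y_test and model.predict(X_test_lstm)
X_test_demo = np.array([[5.1, 3.5, 1.4, 0.2],
                        [6.4, 2.8, 5.6, 2.1],
                        [5.9, 3.0, 4.2, 1.5]])
y_test_demo = np.array([[1, 0, 0], [0, 0, 1], [0, 1, 0]])
fake_probabilities = np.array([[0.8, 0.1, 0.1],
                               [0.1, 0.2, 0.7],
                               [0.2, 0.3, 0.5]])

X_test_lstm = to_lstm_input(X_test_demo)
print(X_test_lstm.shape)  # (3, 1, 4)
# Two of the three made-up predictions match their labels
print(accuracy_from_one_hot(y_test_demo, fake_probabilities))
```

With the real model, the same reshape would be applied to X_test before calling model.evaluate or model.predict on it.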