Dimensions between embedding layer and LSTM encoder layer don't match
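I am trying to build an encoder-decoder model for text generation. I am using LSTM layers with an embedding layer. I have a problem passing the output of the embedding layer to the LSTM encoder layer. The error I get is: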
ValueError: Input 0 of layer lstm is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: (None, 13, 128, 512)
My encoder data has shape (40, 13, 128) = (num_observations, max_encoder_seq_length, vocab_size).
embeddings_size/latent_dim = 512.
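My question is: how can I "get rid of" this 4th dimension between the embedding layer and the LSTM encoder layer? In other words, how should I pass these 4 dimensions to the LSTM layer of the encoder model? Since I am new to this topic, what else should I correct in the decoder LSTM layer?
I have read several related posts but could not find a solution. It seems to me that the problem is not in the model but in the shape of my data. Any hint or remark about what might be wrong would be much appreciated. Thank you very much.
My model is taken from (this tutorial):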
encoder_inputs = Input(shape=(max_encoder_seq_length,))
x = Embedding(num_encoder_tokens, latent_dim)(encoder_inputs)
x, state_h, state_c = LSTM(latent_dim, return_state=True)(x)
encoder_states = [state_h, state_c]
# Set up the decoder, using `encoder_states` as initial state.
decoder_inputs = Input(shape=(max_decoder_seq_length,))
x = Embedding(num_decoder_tokens, latent_dim)(decoder_inputs)
x = LSTM(latent_dim, return_sequences=True)(x, initial_state=encoder_states)
decoder_outputs = Dense(num_decoder_tokens, activation='softmax')(x)
# Define the model that will turn
# `encoder_input_data` & `decoder_input_data` into `decoder_target_data`
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.summary()
# Compile & run training
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
# Note that `decoder_target_data` needs to be one-hot encoded,
# rather than sequences of integers like `decoder_input_data`!
model.fit([encoder_input_data, decoder_input_data],
          decoder_target_data,
          batch_size=batch_size,
          epochs=epochs,
          shuffle=True,
          validation_split=0.05)
My model summary is as follows:
Model: "functional_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_1 (InputLayer)            [(None, 13)]         0
__________________________________________________________________________________________________
input_2 (InputLayer)            [(None, 15)]         0
__________________________________________________________________________________________________
embedding (Embedding)           (None, 13, 512)      65536       input_1[0][0]
__________________________________________________________________________________________________
embedding_1 (Embedding)         (None, 15, 512)      65536       input_2[0][0]
__________________________________________________________________________________________________
lstm (LSTM)                     [(None, 512), (None, 2099200     embedding[0][0]
__________________________________________________________________________________________________
lstm_1 (LSTM)                   (None, 15, 512)      2099200     embedding_1[0][0]
                                                                 lstm[0][1]
                                                                 lstm[0][2]
__________________________________________________________________________________________________
dense (Dense)                   (None, 15, 128)      65664       lstm_1[0][0]
==================================================================================================
Total params: 4,395,136
Trainable params: 4,395,136
Non-trainable params: 0
__________________________________________________________________________________________________
Edit
I am formatting my data in the following way:
for i, text in enumerate(input_texts):
    words = text.split()  # text is a sentence
    for t, word in enumerate(words):
        encoder_input_data[i, t, input_dict[word]] = 1.
This gives, for example, the following for decoder_input_data[:2]:
array([[[0., 1., 0., ..., 0., 0., 0.],
        [0., 0., 1., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        ...,
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.]],
       [[0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 1., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        ...,
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.]]], dtype=float32)
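I am not sure what you are passing to the model as inputs and outputs, but this is what works. Note the shapes of the encoder and decoder inputs I am passing. Your inputs need to have these shapes for the model to run.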
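### IMPORTS
import numpy as np
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense
from tensorflow.keras.models import Model
### INITIAL CONFIGURATION
num_observations = 40
max_encoder_seq_length = 13
max_decoder_seq_length = 15
num_encoder_tokens = 128
num_decoder_tokens = 128
latent_dim = 512
batch_size = 256
epochs = 5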
### MODEL DEFINITION
encoder_inputs = Input(shape=(max_encoder_seq_length,))
x = Embedding(num_encoder_tokens, latent_dim)(encoder_inputs)
x, state_h, state_c = LSTM(latent_dim, return_state=True)(x)
encoder_states = [state_h, state_c]
# Set up the decoder, using `encoder_states` as initial state.
decoder_inputs = Input(shape=(max_decoder_seq_length,))
x = Embedding(num_decoder_tokens, latent_dim)(decoder_inputs)
x = LSTM(latent_dim, return_sequences=True)(x, initial_state=encoder_states)
decoder_outputs = Dense(num_decoder_tokens, activation='softmax')(x)
# Define the model that will turn
# `encoder_input_data` & `decoder_input_data` into `decoder_target_data`
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.summary()
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
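### MODEL INPUT AND OUTPUT SHAPES
# Inputs are 2-D arrays of integer token ids: (num_samples, seq_length).
encoder_input_data = np.random.randint(0, num_encoder_tokens, size=(1000, 13))
decoder_input_data = np.random.randint(0, num_decoder_tokens, size=(1000, 15))
# Targets are 3-D: (num_samples, seq_length, num_decoder_tokens).
# Random values only demonstrate the shape; real targets should be one-hot.
decoder_target_data = np.random.random((1000, 15, 128))
model.fit([encoder_input_data, decoder_input_data],
          decoder_target_data,
          batch_size=batch_size,
          epochs=epochs,
          shuffle=True,
          validation_split=0.05)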
Model: "functional_210"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_176 (InputLayer)          [(None, 13)]         0
__________________________________________________________________________________________________
input_177 (InputLayer)          [(None, 15)]         0
__________________________________________________________________________________________________
embedding_33 (Embedding)        (None, 13, 512)      65536       input_176[0][0]
__________________________________________________________________________________________________
embedding_34 (Embedding)        (None, 15, 512)      65536       input_177[0][0]
__________________________________________________________________________________________________
lstm_94 (LSTM)                  [(None, 512), (None, 2099200     embedding_33[0][0]
__________________________________________________________________________________________________
lstm_95 (LSTM)                  (None, 15, 512)      2099200     embedding_34[0][0]
                                                                 lstm_94[0][1]
                                                                 lstm_94[0][2]
__________________________________________________________________________________________________
dense_95 (Dense)                (None, 15, 128)      65664       lstm_95[0][0]
==================================================================================================
Total params: 4,395,136
Trainable params: 4,395,136
Non-trainable params: 0
__________________________________________________________________________________________________
Epoch 1/5
4/4 [==============================] - 3s 853ms/step - loss: 310.7389 - val_loss: 310.3570
Epoch 2/5
4/4 [==============================] - 3s 638ms/step - loss: 310.6186 - val_loss: 310.3362
Epoch 3/5
4/4 [==============================] - 3s 852ms/step - loss: 310.6126 - val_loss: 310.3345
Epoch 4/5
4/4 [==============================] - 3s 797ms/step - loss: 310.6111 - val_loss: 310.3369
Epoch 5/5
4/4 [==============================] - 3s 872ms/step - loss: 310.6117 - val_loss: 310.3352
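To see where the extra dimension in the error comes from, the embedding lookup can be mimicked with plain NumPy indexing. This is only a rough sketch, not how Keras implements the layer: `embedding_table` is a zero-filled stand-in for the layer's weights, and the batch of 2 is an arbitrary illustrative size. An embedding lookup treats every input entry as a token id and appends one axis of size latent_dim, so a one-hot input of shape (batch, 13, 128) comes out 4-D, while integer ids of shape (batch, 13) come out 3-D as the LSTM expects.

```python
import numpy as np

num_encoder_tokens = 128      # vocabulary size, as in the question
latent_dim = 512              # embedding size, as in the question
max_encoder_seq_length = 13
batch = 2                     # small illustrative batch (assumption)

# Stand-in for the Embedding weights: one row of size latent_dim per token id.
embedding_table = np.zeros((num_encoder_tokens, latent_dim), dtype=np.float32)

# Random integer token ids and their one-hot encoding (the question's format).
rng = np.random.default_rng(0)
token_ids = rng.integers(0, num_encoder_tokens,
                         size=(batch, max_encoder_seq_length))
one_hot = np.eye(num_encoder_tokens, dtype=np.int64)[token_ids]

# Looking up one-hot data: every 0/1 entry is used as an index -> 4-D result.
looked_up_one_hot = embedding_table[one_hot]
print(looked_up_one_hot.shape)

# Looking up integer ids: one vector per time step -> 3-D result.
looked_up_ids = embedding_table[token_ids]
print(looked_up_ids.shape)

# Existing one-hot data can be turned back into integer ids with argmax.
recovered_ids = one_hot.argmax(axis=-1)
```

One caveat with the argmax conversion: padded time steps that are all zeros in the one-hot array also map to id 0, so they become indistinguishable from whichever token has index 0.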
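Sequence data (text) needs to be passed to the inputs as label-encoded sequences, that is, as integer token ids rather than one-hot vectors. This can be done with something like the TextVectorization layer from Keras. Please read in detail about how to prepare text data for embedding layers and LSTMs here.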
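As a minimal sketch of that label encoding without any Keras utilities, the one-hot loop from the question can be adapted to store the word index itself. The sentences below are made up for illustration, and reserving index 0 for padding is an assumption of this sketch rather than something stated in the question; `input_dict` and `encoder_input_data` follow the question's naming.

```python
import numpy as np

# Hypothetical example sentences; the real data would come from the corpus.
input_texts = ["the cat sat", "the dog sat down"]
max_encoder_seq_length = 13

# Build a word -> integer id mapping. Index 0 is kept free for padding.
vocab = sorted({word for text in input_texts for word in text.split()})
input_dict = {word: idx for idx, word in enumerate(vocab, start=1)}

# 2-D integer array: (num_observations, max_encoder_seq_length).
encoder_input_data = np.zeros(
    (len(input_texts), max_encoder_seq_length), dtype="int32")

for i, text in enumerate(input_texts):
    # Truncate sentences that are longer than the fixed sequence length.
    words = text.split()[:max_encoder_seq_length]
    for t, word in enumerate(words):
        encoder_input_data[i, t] = input_dict[word]

print(encoder_input_data.shape)
```

With this layout the first argument of the Embedding layer has to be at least len(vocab) + 1, because the padding index occupies one extra slot. The decoder inputs can be prepared the same way, while the decoder targets stay one-hot for the categorical_crossentropy loss.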