Specifying a seq2seq autoencoder. What does RepeatVector do? And what is the effect of batch learning on predicting output?
I'm building a basic seq2seq autoencoder, but I'm not sure whether I'm doing it correctly.
from keras.models import Sequential
from keras.layers import LSTM, RepeatVector, TimeDistributed, Dense

model = Sequential()
# Encoder
model.add(LSTM(32, activation='relu', input_shape=(timesteps, n_features), return_sequences=True))
model.add(LSTM(16, activation='relu', return_sequences=False))
model.add(RepeatVector(timesteps))
# Decoder
model.add(LSTM(16, activation='relu', return_sequences=True))
model.add(LSTM(32, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(n_features)))
I then fit the model with a batch_size parameter:
model.fit(data, data,
          epochs=30,
          batch_size=32)
The model is compiled with an mse loss function and it seems to learn.
To get the encoder output for the test data, I use a K function:
get_encoder_output = K.function([model.layers[0].input],
                                [model.layers[1].output])
encoder_output = get_encoder_output([test_data])[0]
My first question is whether the model is specified correctly. In particular, is the RepeatVector layer needed? I'm not sure what it is doing. What if I omit it and instead give the preceding layer return_sequences=True?
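For reference, here is a quick standalone shape check I tried (the sizes 3 and 16 are arbitrary, not from my model); it suggests RepeatVector simply tiles the 2D encoder output along a new time axis:

import numpy as np
from keras.models import Sequential
from keras.layers import RepeatVector

# Standalone check: RepeatVector turns (batch, features) into
# (batch, timesteps, features) by repeating the same vector at every timestep.
toy = Sequential([RepeatVector(3, input_shape=(16,))])
print(toy.predict(np.ones((1, 16))).shape)  # expected: (1, 3, 16)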
My second question is whether I need to tell get_encoder_output the batch_size that was used in training.
Thanks in advance for any help with either question.
In my opinion, the best way to implement a seq2seq LSTM in Keras is to use two LSTM models and have the first pass its states to the second.
The last LSTM layer in the encoder will need return_state=True, return_sequences=False so that it passes along its h and c states.
You then need to set up an LSTM decoder that receives these states as its initial_state.
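A minimal sketch of that wiring (just to illustrate the idea; the sizes are placeholders, and the full toy example is further down):

from keras.layers import Input, LSTM, Dense
from keras.models import Model

n_features = 5    # placeholder feature count
latent_dim = 32   # placeholder state size

# Encoder: keep only the final hidden (h) and cell (c) states.
encoder_inputs = Input(shape=(None, n_features))
_, state_h, state_c = LSTM(latent_dim, return_state=True, return_sequences=False)(encoder_inputs)

# Decoder: an LSTM initialised from the encoder's states.
decoder_inputs = Input(shape=(None, n_features))
decoder_outputs = LSTM(latent_dim, return_sequences=True)(decoder_inputs,
                                                          initial_state=[state_h, state_c])
decoder_outputs = Dense(n_features)(decoder_outputs)

seq2seq = Model([encoder_inputs, decoder_inputs], decoder_outputs)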
For the decoder input, you will most likely want to feed a "start of sequence" token as the input at the first timestep, and then use the decoder output from the nth timestep as the decoder input at the (n+1)th timestep.
Once you have that working, also take a look at teacher forcing.
This may be of use to you:
As a toy problem, I created a seq2seq model for predicting the continuation of different sine waves.
Here is the model:
from keras.layers import Input, LSTM, Dense, Lambda
from keras.models import Model
from keras import backend as K

def create_seq2seq():
    features_num = 5
    latent_dim = 40

    # Encoder: a stack of LSTMs; only the last layer returns its states.
    encoder_inputs = Input(shape=(None, features_num))
    encoded = LSTM(latent_dim, return_state=False, return_sequences=True)(encoder_inputs)
    encoded = LSTM(latent_dim, return_state=False, return_sequences=True)(encoded)
    encoded = LSTM(latent_dim, return_state=False, return_sequences=True)(encoded)
    encoded = LSTM(latent_dim, return_state=True)(encoded)
    encoder = Model(inputs=encoder_inputs, outputs=encoded)

    encoder_outputs, state_h, state_c = encoder(encoder_inputs)
    encoder_states = [state_h, state_c]

    # Decoder: four stacked LSTMs, unrolled one timestep at a time.
    decoder_inputs = Input(shape=(1, features_num))
    decoder_lstm_1 = LSTM(latent_dim, return_sequences=True, return_state=True)
    decoder_lstm_2 = LSTM(latent_dim, return_sequences=True, return_state=True)
    decoder_lstm_3 = LSTM(latent_dim, return_sequences=True, return_state=True)
    decoder_lstm_4 = LSTM(latent_dim, return_sequences=True, return_state=True)
    decoder_dense = Dense(features_num)

    all_outputs = []
    inputs = decoder_inputs
    states_1 = encoder_states
    # Placeholder values:
    states_2 = states_1
    states_3 = states_1
    states_4 = states_1

    for _ in range(1):
        # Run the decoder on the first timestep
        outputs_1, state_h_1, state_c_1 = decoder_lstm_1(inputs, initial_state=states_1)
        outputs_2, state_h_2, state_c_2 = decoder_lstm_2(outputs_1)
        outputs_3, state_h_3, state_c_3 = decoder_lstm_3(outputs_2)
        outputs_4, state_h_4, state_c_4 = decoder_lstm_4(outputs_3)

        # Store the current prediction (we will concatenate all predictions later)
        outputs = decoder_dense(outputs_4)
        all_outputs.append(outputs)

        # Reinject the outputs as inputs for the next loop iteration
        # as well as update the states
        inputs = outputs
        states_1 = [state_h_1, state_c_1]
        states_2 = [state_h_2, state_c_2]
        states_3 = [state_h_3, state_c_3]
        states_4 = [state_h_4, state_c_4]

    for _ in range(149):
        # Run the decoder on each timestep
        outputs_1, state_h_1, state_c_1 = decoder_lstm_1(inputs, initial_state=states_1)
        outputs_2, state_h_2, state_c_2 = decoder_lstm_2(outputs_1, initial_state=states_2)
        outputs_3, state_h_3, state_c_3 = decoder_lstm_3(outputs_2, initial_state=states_3)
        outputs_4, state_h_4, state_c_4 = decoder_lstm_4(outputs_3, initial_state=states_4)

        # Store the current prediction (we will concatenate all predictions later)
        outputs = decoder_dense(outputs_4)
        all_outputs.append(outputs)

        # Reinject the outputs as inputs for the next loop iteration
        # as well as update the states
        inputs = outputs
        states_1 = [state_h_1, state_c_1]
        states_2 = [state_h_2, state_c_2]
        states_3 = [state_h_3, state_c_3]
        states_4 = [state_h_4, state_c_4]

    # Concatenate all predictions
    decoder_outputs = Lambda(lambda x: K.concatenate(x, axis=1))(all_outputs)

    model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
    # model = load_model('pre_model.h5')

    print(model.summary())
    return model
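A hedged usage sketch (my assumptions, not from the original post): the model takes two inputs, the full encoder sequence and a single "start of sequence" decoder step, and predicts the next 150 timesteps of 5 features. Shapes and data here are placeholders just to show how fit would be called:

import numpy as np

model = create_seq2seq()
model.compile(optimizer='adam', loss='mse')

n_samples, enc_len = 64, 100                       # assumed sizes
encoder_data = np.random.rand(n_samples, enc_len, 5)
decoder_seed = np.zeros((n_samples, 1, 5))         # "start of sequence" step
targets      = np.random.rand(n_samples, 150, 5)   # 1 + 149 decoder steps

model.fit([encoder_data, decoder_seed], targets, epochs=1, batch_size=16)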