Is it valid to train the autoencoder before building the encoder/decoder models?
I am building my autoencoder following the tutorial at https://blog.keras.io/building-autoencoders-in-keras.html. To do so, I have two strategies:
A) Step 1: build the autoencoder; Step 2: build the encoder; Step 3: build the decoder; Step 4: compile the autoencoder; Step 5: train the autoencoder.
B) Step 1: build the autoencoder; Step 2: compile the autoencoder; Step 3: train the autoencoder; Step 4: build the encoder; Step 5: build the decoder.
In both cases the model converges to a loss of 0.100. However, with strategy A, which is the one described in the tutorial, the reconstruction is very poor. With strategy B the reconstruction is much better.
This makes sense to me, because in strategy A the encoder and decoder models are built on layers that have not been trained yet, so the results are random. In strategy B, on the other hand, the weights are well defined after training, so the reconstruction is better.
My questions are: is strategy B valid, or am I cheating on the reconstruction? In the case of strategy A, is Keras supposed to update the weights of the encoder and decoder models automatically, since those models were built from the autoencoder's layers?
###### Code for Strategy A
# Imports needed for both strategies
from keras.layers import Input, Dense
from keras.models import Model

# Step 1: build the autoencoder
features = Input(shape=(x_train.shape[1],))
encoded = Dense(1426, activation='relu')(features)
encoded = Dense(732, activation='relu')(encoded)
encoded = Dense(328, activation='relu')(encoded)
encoded = Dense(encoding_dim, activation='relu')(encoded)
decoded = Dense(328, activation='relu')(encoded)
decoded = Dense(732, activation='relu')(decoded)
decoded = Dense(1426, activation='relu')(decoded)
decoded = Dense(x_train.shape[1], activation='relu')(decoded)
autoencoder = Model(inputs=features, outputs=decoded)

# Step 2: build the encoder
encoder = Model(features, encoded)

# Step 3: build the decoder from the autoencoder's last four layers
encoded_input = Input(shape=(encoding_dim,))
decoder_layer = autoencoder.layers[-4](encoded_input)
decoder_layer = autoencoder.layers[-3](decoder_layer)
decoder_layer = autoencoder.layers[-2](decoder_layer)
decoder_layer = autoencoder.layers[-1](decoder_layer)
decoder = Model(encoded_input, decoder_layer)

# Step 4: compile the autoencoder
autoencoder.compile(optimizer='adam', loss='mse')

# Step 5: train the autoencoder
history = autoencoder.fit(x_train,
                          x_train,
                          epochs=150,
                          batch_size=256,
                          shuffle=True,
                          verbose=1,
                          validation_split=0.2)

# Testing encoding
encoded_fts = encoder.predict(x_test)
decoded_fts = decoder.predict(encoded_fts)
###### Code for Strategy B
# Step 1: build the autoencoder
features = Input(shape=(x_train.shape[1],))
encoded = Dense(1426, activation='relu')(features)
encoded = Dense(732, activation='relu')(encoded)
encoded = Dense(328, activation='relu')(encoded)
encoded = Dense(encoding_dim, activation='relu')(encoded)
decoded = Dense(328, activation='relu')(encoded)
decoded = Dense(732, activation='relu')(decoded)
decoded = Dense(1426, activation='relu')(decoded)
decoded = Dense(x_train.shape[1], activation='relu')(decoded)
autoencoder = Model(inputs=features, outputs=decoded)

# Step 2: compile the autoencoder
autoencoder.compile(optimizer='adam', loss='mse')

# Step 3: train the autoencoder
history = autoencoder.fit(x_train,
                          x_train,
                          epochs=150,
                          batch_size=256,
                          shuffle=True,
                          verbose=1,
                          validation_split=0.2)

# Step 4: build the encoder
encoder = Model(features, encoded)

# Step 5: build the decoder from the autoencoder's last four layers
encoded_input = Input(shape=(encoding_dim,))
decoder_layer = autoencoder.layers[-4](encoded_input)
decoder_layer = autoencoder.layers[-3](decoder_layer)
decoder_layer = autoencoder.layers[-2](decoder_layer)
decoder_layer = autoencoder.layers[-1](decoder_layer)
decoder = Model(encoded_input, decoder_layer)

# Testing encoding
encoded_fts = encoder.predict(x_test)
decoded_fts = decoder.predict(encoded_fts)
My questions are: is strategy B valid, or am I cheating on the reconstruction?
A and B are equivalent; no, you are not cheating.
In the case of strategy A, is Keras supposed to update the weights of the encoder and decoder models automatically, since those models were built from the autoencoder's layers?
The decoder model simply reuses the autoencoder's layers. In case A:
decoder.layers
Out:
[<keras.engine.input_layer.InputLayer at 0x7f8a44d805c0>,
<keras.layers.core.Dense at 0x7f8a44e58400>,
<keras.layers.core.Dense at 0x7f8a44e746d8>,
<keras.layers.core.Dense at 0x7f8a44e14940>,
<keras.layers.core.Dense at 0x7f8a44e2dba8>]
autoencoder.layers
Out:
[<keras.engine.input_layer.InputLayer at 0x7f8a44e91c18>,
<keras.layers.core.Dense at 0x7f8a44e91c50>,
<keras.layers.core.Dense at 0x7f8a44e91ef0>,
<keras.layers.core.Dense at 0x7f8a44e89080>,
<keras.layers.core.Dense at 0x7f8a44e89da0>,
<keras.layers.core.Dense at 0x7f8a44e58400>,
<keras.layers.core.Dense at 0x7f8a44e746d8>,
<keras.layers.core.Dense at 0x7f8a44e14940>,
<keras.layers.core.Dense at 0x7f8a44e2dba8>]
The hex numbers (the object ids) in the last 4 entries of each list are exactly the same, because they are the same objects. Consequently, they also have the same weights.
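A quick way to check this yourself (a small sanity check, not part of your pipeline; `autoencoder` and `decoder` are the models from case A above):
import numpy as np

# In case A, the decoder's last four layers are the very same objects
# as the autoencoder's last four layers ...
for ae_layer, dec_layer in zip(autoencoder.layers[-4:], decoder.layers[-4:]):
    print(ae_layer is dec_layer)  # True

# ... so their weights are necessarily identical as well
w_ae = autoencoder.layers[-1].get_weights()
w_dec = decoder.layers[-1].get_weights()
print(all(np.array_equal(a, b) for a, b in zip(w_ae, w_dec)))  # True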
In case B:
decoder.layers
Out:
[<keras.engine.input_layer.InputLayer at 0x7f8a41de05f8>,
<keras.layers.core.Dense at 0x7f8a41ee4828>,
<keras.layers.core.Dense at 0x7f8a41eaceb8>,
<keras.layers.core.Dense at 0x7f8a41e50ac8>,
<keras.layers.core.Dense at 0x7f8a41e5d780>]
autoencoder.layers
Out:
[<keras.engine.input_layer.InputLayer at 0x7f8a41da3940>,
<keras.layers.core.Dense at 0x7f8a41da3978>,
<keras.layers.core.Dense at 0x7f8a41da3a90>,
<keras.layers.core.Dense at 0x7f8a41da3b70>,
<keras.layers.core.Dense at 0x7f8a44720cf8>,
<keras.layers.core.Dense at 0x7f8a41ee4828>,
<keras.layers.core.Dense at 0x7f8a41eaceb8>,
<keras.layers.core.Dense at 0x7f8a41e50ac8>,
<keras.layers.core.Dense at 0x7f8a41e5d780>]
- the same picture again: the last four layers coincide.
So, strategies A and B are equivalent with respect to the training order. More generally, if you share layers (and hence weights), the order of building, compiling and training mostly does not matter, because the shared layers live in the same TensorFlow graph.
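As a minimal sketch of that shared-layer behaviour (a toy example with made-up dimensions, not your actual data): training only the autoencoder still changes the weights seen by an encoder that was built before training, because the underlying layer objects are shared.
from keras.layers import Input, Dense
from keras.models import Model
import numpy as np

x = np.random.rand(100, 8)                 # toy data, purely illustrative

inp = Input(shape=(8,))
code = Dense(3, activation='relu')(inp)
out = Dense(8, activation='sigmoid')(code)

auto = Model(inp, out)                     # autoencoder
enc = Model(inp, code)                     # encoder built BEFORE training, as in A

w_before = enc.layers[-1].get_weights()[0].copy()
auto.compile(optimizer='adam', loss='mse')
auto.fit(x, x, epochs=5, verbose=0)
w_after = enc.layers[-1].get_weights()[0]

# Only `auto` was trained, yet the encoder's weights changed,
# because the Dense layers are shared between the two models.
print(np.allclose(w_before, w_after))      # False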
I ran this example on the mnist dataset, and both strategies showed the same performance and reconstructed the images well. I guess that if you run into problems with case A, you have missed something else (I can't say what exactly, because I copy-pasted your code and everything worked fine).
If you use jupyter, sometimes restarting the kernel and running everything from top to bottom helps.
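For reference, this is roughly the mnist setup from the tutorial you linked (the exact preprocessing here is my assumption; encoding_dim is whatever bottleneck size you choose):
from keras.datasets import mnist

# Flatten the 28x28 images and scale pixel values to [0, 1]
(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.reshape((len(x_train), -1)).astype('float32') / 255.
x_test = x_test.reshape((len(x_test), -1)).astype('float32') / 255.
encoding_dim = 32   # bottleneck size; pick whatever you used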