Unable to train simple autoencoder in Keras
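I'm trying to train an autoencoder in Keras for signal processing, but for some reason it is failing.
My inputs are segments of 128 frames for 6 measures (acceleration_x/y/z, gyro_x/y/z), so the overall shape of my dataset is (22836, 128, 6), where 22836 is the sample size.
This is the sample code I'm using for the autoencoder: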
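import numpy as np
import matplotlib.pyplot as plt
from keras.layers import Input, Dense
from keras.models import Model

X_train, X_test, Y_train, Y_test = load_dataset()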
# reshape the input, whose size is (22836, 128, 6)
X_train = X_train.reshape(X_train.shape[0], np.prod(X_train.shape[1:]))
X_test = X_test.reshape(X_test.shape[0], np.prod(X_test.shape[1:]))
# now the shape will be (22836, 768)
### MODEL ###
input_shape = [X_train.shape[1]]
X_input = Input(input_shape)
x = Dense(1000, activation='sigmoid', name='enc0')(X_input)
encoded = Dense(350, activation='sigmoid', name='enc1')(x)
x = Dense(1000, activation='sigmoid', name='dec0')(encoded)
decoded = Dense(input_shape[0], activation='sigmoid', name='dec1')(x)
model = Model(inputs=X_input, outputs=decoded, name='autoencoder')
model.compile(optimizer='rmsprop', loss='mean_squared_error')
print(model.summary())
The output of model.summary() is:
Model summary
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_55 (InputLayer) (None, 768) 0
_________________________________________________________________
enc0 (Dense) (None, 1000) 769000
_________________________________________________________________
enc1 (Dense) (None, 350) 350350
_________________________________________________________________
dec1 (Dense) (None, 1000) 351000
_________________________________________________________________
dec0 (Dense) (None, 768) 768768
=================================================================
Total params: 2,239,118
Trainable params: 2,239,118
Non-trainable params: 0
Training is done via:
# train the model
history = model.fit(x = X_train, y = X_train,
epochs=5,
batch_size=32,
validation_data=(X_test, X_test))
Here I am simply trying to learn the identity function, which produces:
Train on 22836 samples, validate on 5709 samples
Epoch 1/5
22836/22836 [==============================] - 27s 1ms/step - loss: 0.9481 - val_loss: 0.8862
Epoch 2/5
22836/22836 [==============================] - 24s 1ms/step - loss: 0.8669 - val_loss: 0.8358
Epoch 3/5
22836/22836 [==============================] - 25s 1ms/step - loss: 0.8337 - val_loss: 0.8146
Epoch 4/5
22836/22836 [==============================] - 25s 1ms/step - loss: 0.8164 - val_loss: 0.7960
Epoch 5/5
22836/22836 [==============================] - 25s 1ms/step - loss: 0.8004 - val_loss: 0.7819
At this point, to get a sense of how well it performs, I plotted some true inputs against the predicted ones:
prediction = model.predict(X_test)
for i in np.random.randint(0, 100, 7):
    pred = prediction[i, :].reshape(128, 6)
    # getting only values for acceleration_x
    pred = pred[:, 0]
    true = X_test[i, :].reshape(128, 6)
    # getting only values for acceleration_x
    true = true[:, 0]
    # plot original and reconstructed
    fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(20, 6))
    ax1.plot(true, color='green')
    ax2.plot(pred, color='red')
These are some of the plots, which look completely wrong:
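Do you have any suggestions about what the problem might be, apart from the small number of epochs (which actually does not seem to make any difference)?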
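Your data is not in the range [0, 1], so why are you using sigmoid as the activation function in the last layer? Remove the activation function from the last layer (and it would probably be better to use relu in the earlier layers).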
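To make the point about the output range concrete, here is a minimal NumPy sketch. It assumes synthetic, roughly standardized data (zero mean, unit variance) in place of the real sensor recordings, which are not available here; the helper `mse_floor_for_unit_interval` and the arrays `targets` and `inside` are made up for illustration. A sigmoid output always lies in (0, 1), so the closest it can get to any target value is that value clipped to the interval, and the mean squared error cannot drop below the floor computed here however long you train.

```python
import numpy as np


def mse_floor_for_unit_interval(targets):
    """Lowest MSE reachable by predictions restricted to [0, 1].

    A sigmoid output lies strictly inside (0, 1), so for every target value
    the closest it can get is that value clipped to [0, 1].
    """
    best_possible = np.clip(targets, 0.0, 1.0)
    return float(np.mean((targets - best_possible) ** 2))


# Assumption: synthetic stand-in for the flattened sensor data, 768 features.
rng = np.random.default_rng(0)
targets = rng.standard_normal((2000, 128 * 6))

floor = mse_floor_for_unit_interval(targets)
print(f"MSE floor with a sigmoid output: {floor:.3f}")

# Data that already lies inside [0, 1] has no such floor.
inside = rng.uniform(0.0, 1.0, size=(2000, 128 * 6))
inside_floor = mse_floor_for_unit_interval(inside)
print(f"MSE floor for data in [0, 1]:    {inside_floor:.3f}")
```

With a linear output layer (a Keras `Dense` layer with `activation=None`), the predictions are unbounded and this floor no longer applies.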
Also normalize the training data. You can use feature-wise normalization:
X_mean = X_train.mean(axis=0)
X_train -= X_mean
X_std = X_train.std(axis=0)
X_train /= X_std + 1e-8
And don't forget to use the computed statistics (X_mean and X_std) to normalize the test data at inference time (i.e. testing).
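As a rough sketch of that last step, the statistics can be computed once on the training set and then reused unchanged for the test set. The helper names `fit_standardizer` and `standardize` are hypothetical, and the random arrays only stand in for the real flattened dataset.

```python
import numpy as np


def fit_standardizer(X_train, eps=1e-8):
    """Compute per-feature mean and std from the training data only."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0) + eps  # eps guards against constant features
    return mean, std


def standardize(X, mean, std):
    """Apply previously computed statistics; returns a new float array."""
    return (X.astype(np.float64) - mean) / std


# Assumption: random stand-ins for the flattened (n_samples, 768) arrays.
rng = np.random.default_rng(0)
X_train = rng.normal(loc=3.0, scale=5.0, size=(1000, 768))
X_test = rng.normal(loc=3.0, scale=5.0, size=(250, 768))

X_mean, X_std = fit_standardizer(X_train)
X_train_norm = standardize(X_train, X_mean, X_std)
# Reuse the training statistics; do not recompute them on the test set.
X_test_norm = standardize(X_test, X_mean, X_std)

print(X_train_norm.mean(), X_train_norm.std())
print(X_test_norm.mean(), X_test_norm.std())
```

Unlike the in-place `-=` and `/=` version, this returns new arrays, so it also works when the raw data has an integer dtype. The reconstructions then live in the normalized space; multiplying by `X_std` and adding `X_mean` maps them back to the original units for plotting.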