Trained a convolutional autoencoder. Now need help extracting the feature space

I built an autoencoder using my own dataset of roughly 32k images. I did a 75/25 split for training/testing, and I was able to get results I'm quite happy with.

Now I would like to extract the feature space and map it to every image in my dataset, as well as to new data that wasn't tested. I couldn't find a tutorial online that dives into using the encoder as a feature extractor; all I can find is how to build the full network.

My code:

from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D, Cropping2D
from keras.models import Model

input_img = Input(shape=(200, 200, 1))
# encoder part of the model (filter count increases after each layer)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(128, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)

# decoder part of the model (went backwards from the encoder)
x = Conv2D(128, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
decoded = Cropping2D(cropping=((8,0), (8,0)), data_format=None)(x)

autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
autoencoder.summary()

In case anyone is interested, here is my network setup:

Model: "model_22"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_23 (InputLayer)        (None, 200, 200, 1)       0         
_________________________________________________________________
conv2d_186 (Conv2D)          (None, 200, 200, 16)      160       
_________________________________________________________________
max_pooling2d_83 (MaxPooling (None, 100, 100, 16)      0         
_________________________________________________________________
conv2d_187 (Conv2D)          (None, 100, 100, 32)      4640      
_________________________________________________________________
max_pooling2d_84 (MaxPooling (None, 50, 50, 32)        0         
_________________________________________________________________
conv2d_188 (Conv2D)          (None, 50, 50, 64)        18496     
_________________________________________________________________
max_pooling2d_85 (MaxPooling (None, 25, 25, 64)        0         
_________________________________________________________________
conv2d_189 (Conv2D)          (None, 25, 25, 128)       73856     
_________________________________________________________________
max_pooling2d_86 (MaxPooling (None, 13, 13, 128)       0         
_________________________________________________________________
conv2d_190 (Conv2D)          (None, 13, 13, 128)       147584    
_________________________________________________________________
up_sampling2d_82 (UpSampling (None, 26, 26, 128)       0         
_________________________________________________________________
conv2d_191 (Conv2D)          (None, 26, 26, 64)        73792     
_________________________________________________________________
up_sampling2d_83 (UpSampling (None, 52, 52, 64)        0         
_________________________________________________________________
conv2d_192 (Conv2D)          (None, 52, 52, 32)        18464     
_________________________________________________________________
up_sampling2d_84 (UpSampling (None, 104, 104, 32)      0         
_________________________________________________________________
conv2d_193 (Conv2D)          (None, 104, 104, 16)      4624      
_________________________________________________________________
up_sampling2d_85 (UpSampling (None, 208, 208, 16)      0         
_________________________________________________________________
conv2d_194 (Conv2D)          (None, 208, 208, 1)       145       
_________________________________________________________________
cropping2d_2 (Cropping2D)    (None, 200, 200, 1)       0         
=================================================================
Total params: 341,761
Trainable params: 341,761
Non-trainable params: 0

And here is my training:

autoencoder.fit(train, train,
                epochs=3,
                batch_size=128,
                shuffle=True,
                validation_data=(test, test))

My results:

Train on 23412 samples, validate on 7805 samples
Epoch 1/3
23412/23412 [==============================] - 773s 33ms/step - loss: 0.0620 - val_loss: 0.0398
Epoch 2/3
23412/23412 [==============================] - 715s 31ms/step - loss: 0.0349 - val_loss: 0.0349
Epoch 3/3
23412/23412 [==============================] - 753s 32ms/step - loss: 0.0314 - val_loss: 0.0319

I'd rather not share the images, but they look like they were reconstructed quite well.

Thanks for all your help!

I'm not sure I fully understand your question, but do you want to obtain the resulting feature space for every image you trained on, as well as for other images? Why not simply do the following?

Name the encoding layer in your autoencoder architecture 'embedding'. Then build the encoder as follows:

embedding_layer = autoencoder.get_layer(name='embedding').output
encoder = Model(input_img, embedding_layer)
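
For example, a minimal sketch of how this fits the architecture from the question (the layer name 'embedding' is whatever you choose, and `new_images` is a hypothetical array of shape (n, 200, 200, 1), not something from your code):

# in the question's architecture, give the bottleneck layer the name 'embedding'
encoded = MaxPooling2D((2, 2), padding='same', name='embedding')(x)

# after training, use the encoder built above to map images to the feature space
train_features = encoder.predict(train)        # shape (23412, 13, 13, 128)
new_features = encoder.predict(new_images)     # shape (n, 13, 13, 128)

# each image's feature map can be flattened to a 13*13*128 = 21632-dim vector
train_vectors = train_features.reshape(len(train_features), -1)

The flattened vectors can then be used for whatever downstream task you have in mind (clustering, nearest-neighbour search, a classifier, etc.).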