Saving an Auto-encoder trained to reduce dimensions
I built an auto-encoder for dimensionality reduction, and I want to save it so I can reuse it to reduce the dimensions of my test dataset. Here is my code:
dom_state = seed(123)
print('Rescaling Data')
y = minmax_scale(X, axis=0)
ncol = y.shape[1] #here ncol = 19
print('Encoding Dimensions')
encoding_dim = 3
input_dim = Input(shape = (ncol,))
with tf.Session(config=tf.ConfigProto(intra_op_parallelism_threads=24)) as sess:
    K.set_session(sess)
    print('Initiating Encoder Layer')
    encoded1 = Dense(20, activation = 'relu')(input_dim)
    encoded2 = Dense(10, activation = 'relu')(encoded1)
    encoded3 = Dense(5, activation = 'relu')(encoded2)
    encoded4 = Dense(encoding_dim, activation = 'relu')(encoded3)
    print('Initiating Decoder Layer')
    decoded1 = Dense(5, activation = 'relu')(encoded4)
    decoded2 = Dense(10, activation = 'relu')(decoded1)
    decoded3 = Dense(20, activation = 'relu')(decoded2)
    decoded4 = Dense(ncol, activation = 'sigmoid')(decoded3)
    print('Combine Encoder and Decoder layers')
    autoencoder = Model(input = input_dim, output = decoded4)
    print('Compiling Model')
    autoencoder.compile(optimizer = 'Nadam', loss = 'mse')
    autoencoder.fit(y, y, nb_epoch = 300, batch_size = 20, shuffle = True)
    encoder = Model(input = input_dim, output = decoded4)
    encoder.save('reduction_param.h5')
    print('Initiating Dimension Reduction')
    model = load_model('reduction_param.h5')
    encoded_input = Input(shape = (encoding_dim, ))
    encoded_out = model.predict(y)
However, even though I restricted the bottleneck dimensions, at the model.predict(y) step I still get the full 19 columns instead of 3. On top of that, I get this warning:
UserWarning: No training configuration found in save file: the model was *not* compiled. Compile it manually.
warnings.warn('No training configuration found in save file:
I understand why: the model saved by encoder.save('reduction_param.h5') was never actually compiled with an optimizer. Am I missing something?
EDIT:
I don't know whether this is the right way to solve the problem, but basically I fit MinMaxScaler() on the training dataset, save the fitted scaler as a pickle, and then reuse it to transform the data while keeping the auto-encoder as before, per this code:
dom_state = seed(123)
print('Rescaling Data')
feature_space= MinMaxScaler()
feature_pkl = feature_space.fit(X)
filename = 'lc_feature_space.sav'
pickle.dump(feature_pkl, open(filename, 'wb'))
loaded_model = pickle.load(open(filename, 'rb'))
y = loaded_model.transform(X)
ncol = y.shape[1]
print(ncol)
print('Encoding Dimensions')
encoding_dim = 3
input_dim = Input(shape = (ncol,))
with tf.Session(config=tf.ConfigProto(intra_op_parallelism_threads=24)) as sess:
    K.set_session(sess)
    print('Initiating Encoder Layer')
    encoded1 = Dense(20, activation = 'relu')(input_dim)
    encoded2 = Dense(10, activation = 'relu')(encoded1)
    encoded3 = Dense(5, activation = 'relu')(encoded2)
    encoded4 = Dense(encoding_dim, activation = 'relu')(encoded3)
    print('Initiating Decoder Layer')
    decoded1 = Dense(5, activation = 'relu')(encoded4)
    decoded2 = Dense(10, activation = 'relu')(decoded1)
    decoded3 = Dense(20, activation = 'relu')(decoded2)
    decoded4 = Dense(ncol, activation = 'sigmoid')(decoded3)
    print('Combine Encoder and Decoder layers')
    autoencoder = Model(input = input_dim, output = decoded4)
    print('Compiling Model')
    autoencoder.compile(optimizer = 'Nadam', loss = 'mse')
    autoencoder.fit(y, y, nb_epoch = 300, batch_size = 20, shuffle = True)
    print('Initiating Dimension Reduction')
    encoder = Model(input = input_dim, output = decoded4)
    encoded_input = Input(shape = (encoding_dim, ))
    encoded_out = encoder.predict(y)
    result = encoded_out[0:2]
My reasoning here is to capture the training dataset's feature ranges at the MinMaxScaler() level, transform the test dataset with those same parameters, and then reduce it with the auto-encoder. I still don't know whether this is correct.
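For reference, that scaler-persistence pattern can be sketched as follows with synthetic data (the array shapes are made up for illustration); the key point is that the scaler is fitted on the training set only and merely applied to the test set:

```python
import pickle
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X_train = np.random.rand(100, 19)  # stand-in for the training features
X_test = np.random.rand(20, 19)    # stand-in for the test features

# Fit on training data only, then persist the fitted scaler.
scaler = MinMaxScaler().fit(X_train)
with open('lc_feature_space.sav', 'wb') as f:
    pickle.dump(scaler, f)

# Later: reload and apply the SAME scaling to the test set.
with open('lc_feature_space.sav', 'rb') as f:
    loaded_scaler = pickle.load(f)
y_test = loaded_scaler.transform(X_test)
print(y_test.shape)  # (20, 19)
```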
I think the reason you don't see the encoder working properly (i.e. reducing the dimensionality of the input tensor) is that you defined and saved the wrong model. You should use

encoder = Model(input = input_dim, output = encoded4)

whose output node is encoded4, not decoded4.
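A minimal sketch of the fix, using tensorflow.keras and synthetic data since the full pipeline isn't shown: cut the model off at the bottleneck layer, save it, and reload it for inference. Compiling before saving is optional for a predict-only model, but it silences the "No training configuration found" warning on reload.

```python
import numpy as np
from tensorflow.keras.models import Model, load_model
from tensorflow.keras.layers import Input, Dense

ncol, encoding_dim = 19, 3
y = np.random.rand(100, ncol).astype('float32')  # stand-in for the scaled data

input_dim = Input(shape=(ncol,))
encoded = Dense(20, activation='relu')(input_dim)
encoded = Dense(10, activation='relu')(encoded)
encoded = Dense(5, activation='relu')(encoded)
encoded = Dense(encoding_dim, activation='relu')(encoded)  # bottleneck (encoded4)

# The encoder's output is the bottleneck, NOT the decoder's reconstruction.
encoder = Model(inputs=input_dim, outputs=encoded)
encoder.compile(optimizer='adam', loss='mse')  # optional: silences the reload warning
encoder.save('reduction_param.h5')

model = load_model('reduction_param.h5')
encoded_out = model.predict(y)
print(encoded_out.shape)  # (100, 3)
```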