Why do some of the hidden units return zero in the GRU autoencoder?
I have implemented a recurrent neural network autoencoder as below:
import tensorflow as tf
from tensorflow.keras import regularizers
from tensorflow.keras.layers import Input, GRU, RepeatVector, TimeDistributed, Dense
from tensorflow.keras.models import Model

def AE_GRU(X):
    inputs = Input(shape=(X.shape[1], X.shape[2]), name="input")
    # Encoder: 8 units -> 4-unit bottleneck
    L1 = GRU(8, activation="relu", return_sequences=True, kernel_regularizer=regularizers.l2(0.00), name="E1")(inputs)
    L2 = GRU(4, activation="relu", return_sequences=False, name="E2")(L1)
    # Repeat the bottleneck vector along the time axis
    L3 = RepeatVector(X.shape[1], name="RepeatVector")(L2)
    # Decoder: 4 units -> 8 units -> per-timestep reconstruction
    L4 = GRU(4, activation="relu", return_sequences=True, name="D1")(L3)
    L5 = GRU(8, activation="relu", return_sequences=True, name="D2")(L4)
    output = TimeDistributed(Dense(X.shape[2]), name="output")(L5)
    model = Model(inputs=inputs, outputs=[output])
    return model
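As a quick sanity check (a sketch with made-up shapes; trainX is assumed to be a 3D array of (n_samples, timesteps, features)), the autoencoder should map its input back to a tensor of the same shape:

import numpy as np

# Hypothetical smoke test with random data standing in for trainX.
dummy = np.random.rand(16, 10, 3).astype("float32")  # (samples, timesteps, features)
ae = AE_GRU(dummy)
print(ae.predict(dummy).shape)                       # expected: (16, 10, 3)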
Then I ran the code below to train the AE:
model = AE_GRU(trainX)
optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)
model.compile(optimizer=optimizer, loss="mse")
model.summary()

epochs = 5
batch_size = 64
history = model.fit(
    trainX, trainX,
    epochs=epochs, batch_size=batch_size,
    validation_data=(valX, valX)
).history
I have also attached the result of model.summary() below.
Finally, I extracted the output of the second hidden layer (the 4-unit bottleneck, layer "E2") by running the code below.
import numpy as np
from tensorflow.keras import backend as K

def all_hidden_layers_output(iModel, dtset):
    inp = iModel.input                                        # input placeholder
    outputs = [layer.output for layer in iModel.layers]       # all layer outputs
    functors = [K.function([inp], [out]) for out in outputs]  # evaluation functions
    layer_outs = [func([dtset]) for func in functors]
    return layer_outs

# index 2 is layer "E2" (index 0 is the InputLayer); [0] unwraps the single output
hidden_state_train = all_hidden_layers_output(model, trainX)[2][0]
hidden_state_val = all_hidden_layers_output(model, valX)[2][0]

# remove all-zero columns:
hidden_state_train = hidden_state_train[:, ~np.all(hidden_state_train == 0.0, axis=0)]
hidden_state_val = hidden_state_val[:, ~np.all(hidden_state_val == 0.0, axis=0)]

print(f"hidden_state_train.shape={hidden_state_train.shape}")
print(f"hidden_state_val.shape={hidden_state_val.shape}")
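For reference, an equivalent and simpler way to read one layer's activations in TF2 is a sub-model built from the trained model (a sketch; the names encoder and hidden_state_train_alt are my own):

# Hypothetical alternative to K.function: expose the bottleneck layer "E2"
# through a sub-model and call predict on it directly.
encoder = Model(inputs=model.input, outputs=model.get_layer("E2").output)
hidden_state_train_alt = encoder.predict(trainX)  # shape: (n_samples, 4)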
But I don't know why some of this layer's units keep returning zero. I expected hidden_state_train and hidden_state_val to come out as 2D numpy arrays with 4 non-zero columns (based on the model.summary() information). Any help would be greatly appreciated.
This is probably the dying ReLU problem. ReLU outputs 0 for any negative input, so a unit whose pre-activations are negative for every sample gets stuck at zero and receives no gradient to recover. Take a look at this explanation of the problem: https://towardsdatascience.com/the-dying-relu-problem-clearly-explained-42d0c54e0d24
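One minimal mitigation sketch, assuming the architecture above is otherwise kept as-is: GRU's default activation is tanh, which is bounded and cannot get permanently stuck at a hard zero the way ReLU can, so simply dropping activation="relu" (and possibly lowering the 0.01 learning rate) may keep all 4 bottleneck units alive. The function name AE_GRU_tanh is hypothetical:

# Same architecture as AE_GRU, but with the GRU default tanh activation
# instead of relu, so units cannot die at a hard zero.
def AE_GRU_tanh(X):
    inputs = Input(shape=(X.shape[1], X.shape[2]), name="input")
    L1 = GRU(8, return_sequences=True, name="E1")(inputs)
    L2 = GRU(4, return_sequences=False, name="E2")(L1)
    L3 = RepeatVector(X.shape[1], name="RepeatVector")(L2)
    L4 = GRU(4, return_sequences=True, name="D1")(L3)
    L5 = GRU(8, return_sequences=True, name="D2")(L4)
    output = TimeDistributed(Dense(X.shape[2]), name="output")(L5)
    return Model(inputs=inputs, outputs=[output])

Alternatively, a leaky variant (e.g. activation=tf.nn.leaky_relu) keeps a small gradient for negative inputs, which also avoids units dying outright.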