LSTM autoencoder: using the first LSTM's output as the target for the decoder

I have sequences of 10 days of sensor events, plus a true/false label specifying whether the sensor raised an alert within those 10 days:

sensor_id  timestamp                feature_1  feature_2  10_days_alert_label
1          2020-12-20 01:00:34.565  0.23       0.1        1
1          2020-12-20 01:03:13.897  0.3        0.12       1
2          2020-12-20 01:00:34.565  0.13       0.4        0
2          2020-12-20 01:03:13.897  0.2        0.9        0

95% of the sensors never raise an alert, so the data is imbalanced. I am considering an autoencoder model to detect the anomalies (the sensors that did raise an alert). Since I am not interested in decoding the entire sequence, only in the context vector learned by the LSTM, what I have in mind is an architecture where the decoder reconstructs the encoder output rather than the input sequence.
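
As a rough sketch (not part of the original post), events like the table above could be grouped per sensor and padded into the [samples, timesteps, features] tensor an LSTM expects; df, to_sequences, and seq_len below are made-up names:

import numpy as np
import pandas as pd

# hypothetical preprocessing: df holds the raw events from the table above;
# to_sequences and seq_len are made-up names, not part of the original pipeline
def to_sequences(df, feature_cols=('feature_1', 'feature_2'), seq_len=50):
    X, y = [], []
    for _, g in df.sort_values('timestamp').groupby('sensor_id'):
        feats = g[list(feature_cols)].to_numpy()[:seq_len]        # trim long sensors
        pad = np.zeros((seq_len - len(feats), len(feature_cols)))
        X.append(np.vstack([feats, pad]))                         # zero-pad short sensors
        y.append(g['10_days_alert_label'].iloc[0])                # one label per sensor
    return np.stack(X), np.array(y)                               # X: [samples, timesteps, features]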

Searching around, I found this simple LSTM autoencoder example:

# lstm autoencoder recreate sequence
from numpy import array
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import RepeatVector
from tensorflow.keras.layers import TimeDistributed
from tensorflow.keras.utils import plot_model
# define input sequence
sequence = array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
# reshape input into [samples, timesteps, features]
n_in = len(sequence)
sequence = sequence.reshape((1, n_in, 1))
# define model
model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(n_in,1)))
model.add(RepeatVector(n_in))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(sequence, sequence, epochs=300, verbose=0)
plot_model(model, show_shapes=True, to_file='reconstruct_lstm_autoencoder.png')
# demonstrate recreation
yhat = model.predict(sequence, verbose=0)
print(yhat[0,:,0])
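
As an aside, once a model like this is trained, the context vector itself can be read out with a second Model; a minimal sketch against the code above, where encoder and context are made-up names:

from tensorflow.keras.models import Model

# hypothetical: expose the first LSTM's output, i.e. the 100-dim context vector
encoder = Model(inputs=model.inputs, outputs=model.layers[0].output)
context = encoder.predict(sequence, verbose=0)  # shape: (1, 100)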

I would like to modify the example above so that the first LSTM's output is used as the decoder's target. Something like:

# lstm autoencoder recreate sequence
from numpy import array
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import RepeatVector
from tensorflow.keras.layers import TimeDistributed
from tensorflow.keras.utils import plot_model
# define input sequence
sequence = array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
# reshape input into [samples, timesteps, features]
n_in = len(sequence)
sequence = sequence.reshape((1, n_in, 1))
# define model
model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(n_in,1)))

model.add(Dense(100, activation='relu')) # First LSTM output
model.add(Dense(32, activation='relu')) # Bottleneck 
model.add(Dense(100, activation='sigmoid')) # Decoded vector

model.compile(optimizer='adam', loss='mse')

# fit model
model.fit(sequence, FIRST_LSTM_OUTPUT, epochs=300, verbose=0) # <--- ???

Q: Can I use the first LSTM's output vector as the target?

You can do this with model.add_loss. In add_loss we specify the loss we are interested in (in our case, mse) and set the layers used to compute it (in our case, the first LSTM's output and the model's prediction).

Here is a dummy example:

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.models import Model

n_sample, timesteps = 100, 9
X = np.random.uniform(0, 1, (n_sample, timesteps, 1))

def mse(enc_output, pred):
    return tf.reduce_mean(tf.square(enc_output - pred))

inp = Input((timesteps, 1))
enc = LSTM(100, activation='relu')(inp)     # encoder: first LSTM
x = Dense(100, activation='relu')(enc)
x = Dense(32, activation='relu')(x)         # bottleneck
out = Dense(100, activation='sigmoid')(x)   # decoded vector
model = Model(inp, out)

# the loss compares the decoder output with the first LSTM's output
model.add_loss(mse(enc, out))
model.compile(optimizer='adam', loss=None)
model.fit(X, y=None, epochs=3)
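
At inference time the same two tensors give a per-sample anomaly score: a second Model that exposes both the encoder output and the reconstruction lets you compute each sample's error (a sketch; score_model and scores are made-up names):

# hypothetical scoring step: large errors suggest anomalous sensors
score_model = Model(inp, [enc, out])
enc_vals, rec_vals = score_model.predict(X)
scores = np.mean(np.square(enc_vals - rec_vals), axis=1)  # one error per sample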


What you need is a way to compute the loss on intermediate layers, i.e. on the outputs of hidden activations (in your case, the first LSTM and the last Dense). In tf.keras we can do that using the add_loss() method, which can depend on the layers' inputs (tensors). You can also check this discussion thread about the issue.

import tensorflow as tf
from numpy import array
from tensorflow.keras.layers import LSTM, Dense

# data
sequence = array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
n_in = len(sequence)
sequence = sequence.reshape((1, n_in, 1))

# layers
inputs = tf.keras.Input(shape=(n_in, 1))
lx = LSTM(100, activation=tf.nn.relu)(inputs)  # first LSTM (encoder)
x = Dense(100, activation=tf.nn.relu)(lx)
x = Dense(32, activation=tf.nn.relu)(x)
outputs = Dense(100, activation=tf.nn.sigmoid)(x)
model = tf.keras.Model(inputs=inputs, outputs=outputs)

# add_loss: compute the mse between the model's
# last layer and the output of the first LSTM
model.add_loss(tf.keras.metrics.mean_squared_error(outputs, lx))
model.compile(optimizer='adam')

# fit model
# no need to pass the first LSTM output as a target; add_loss already wired it in
model.fit(sequence, epochs=3, verbose=1)

Epoch 1/3
1/1 [==============================] - 1s 1s/step - loss: 0.2270
Epoch 2/3
1/1 [==============================] - 0s 18ms/step - loss: 0.2250
Epoch 3/3
1/1 [==============================] - 0s 15ms/step - loss: 0.2227

model.predict(sequence).shape
(1, 100)
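
As a closing note, since about 95% of the sensors never alert, one way to turn such reconstruction errors into alert predictions is to threshold them at a high percentile of the errors seen during training. A sketch under that assumption, with train_scores, scores, threshold, and alerts all made-up names:

import numpy as np

# hypothetical thresholding: flag sensors whose error exceeds the 95th
# percentile of the errors observed on (mostly normal) training data
threshold = np.percentile(train_scores, 95)
alerts = scores > threshold  # True -> sensor predicted to raise an alert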