Input 0 is incompatible with layer repeat_vector_40: expected ndim=2, found ndim=1

I am developing an LSTM autoencoder model for anomaly detection. My Keras model is set up as follows:

from keras.models import Sequential

from keras import Model, layers
from keras.layers import Layer, Conv1D, Input, Masking, Dense, RNN, LSTM, Dropout, RepeatVector, TimeDistributed, Reshape

def create_RNN_with_attention():
    x=Input(shape=(X_train_dt.shape[1], X_train_dt.shape[2]))
    RNN_layer_1 = LSTM(units=64, return_sequences=False)(x)
    attention_layer = attention()(RNN_layer_1)
    dropout_layer_1 = Dropout(rate=0.2)(attention_layer)
    repeat_vector_layer = RepeatVector(n=X_train_dt.shape[1])(dropout_layer_1)
    RNN_layer_2 = LSTM(units=64, return_sequences=True)(repeat_vector_layer)
    dropout_layer_2 = Dropout(rate=0.2)(RNN_layer_2)
    output = TimeDistributed(Dense(X_train_dt.shape[2], trainable=True))(dropout_layer_2)
    model=Model(x,output)
    model.compile(loss='mae', optimizer='adam')    
    return model

Note the attention layer I added, attention_layer. Before adding it the model compiled perfectly, but after adding attention_layer the model throws the following error: ValueError: Input 0 is incompatible with layer repeat_vector_40: expected ndim=2, found ndim=1

My attention layer is set up as follows:

import keras.backend as K
class attention(Layer):
    def __init__(self,**kwargs):
        super(attention,self).__init__(**kwargs)
 
    def build(self,input_shape):
        self.W=self.add_weight(name='attention_weight', shape=(input_shape[-1],1), 
                               initializer='random_normal', trainable=True)
        self.b=self.add_weight(name='attention_bias', shape=(input_shape[1],1), 
                               initializer='zeros', trainable=True)        
        super(attention, self).build(input_shape)
 
    def call(self,x):
        # Alignment scores. Pass them through tanh function
        e = K.tanh(K.dot(x,self.W)+self.b)
        # Remove dimension of size 1
        e = K.squeeze(e, axis=-1)   
        # Compute the weights
        alpha = K.softmax(e)
        # Reshape to tensorFlow format
        alpha = K.expand_dims(alpha, axis=-1)
        # Compute the context vector
        context = x * alpha
        context = K.sum(context, axis=1)
        return context

The idea of the attention mask is to let the model focus on the more prominent features as it trains.

Why am I getting the error above, and how can I fix it?

I think the problem lies in this line:

RNN_layer_1 = LSTM(units=64, return_sequences=False)(x)

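This layer outputs a tensor of shape (batch_size, 64), that is, a single vector per sample with no time axis. So you are running the attention mechanism with respect to the batch dimension instead of a sequence dimension. The final K.sum(context, axis=1) in the attention layer then removes one more axis, so the layer returns a 1-D tensor that Keras layers cannot work with. That is why RepeatVector raises the error: it expects a 2-D input of shape (batch_size, features).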
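To make the rank problem concrete, here is a minimal NumPy sketch. It only isolates the final reduction over axis=1 and ignores the earlier steps of the attention layer, so the exact shape Keras reports internally may differ. The sizes (4, 10, 64) and the variable names are illustrative assumptions, not values from the question.

```python
import numpy as np

batch, timesteps, units = 4, 10, 64  # illustrative sizes, not from the question

# Stand-ins for the two possible LSTM outputs.
seq_out = np.zeros((batch, timesteps, units))  # return_sequences=True
last_out = np.zeros((batch, units))            # return_sequences=False

# The attention layer ends with a sum over axis=1, which drops one axis.
ctx_from_seq = seq_out.sum(axis=1)
ctx_from_last = last_out.sum(axis=1)

print(ctx_from_seq.ndim, ctx_from_seq.shape)    # 2 (4, 64)
print(ctx_from_last.ndim, ctx_from_last.shape)  # 1 (4,)
```

Only the 3-D input leaves a 2-D result, which is the rank the error message says RepeatVector expects.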

If you want to run the attention mechanism over the sequence, you should change the line mentioned above to:

RNN_layer_1 = LSTM(units=64, return_sequences=True)(x)
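As a rough check of the shapes after this change, the sketch below mirrors the attention layer's call in plain NumPy rather than Keras. The helper attention_context, the random inputs, and the sizes are hypothetical stand-ins; np.repeat is used here only to imitate what RepeatVector(n=timesteps) does to a 2-D input.

```python
import numpy as np

def attention_context(x, W, b):
    """NumPy mirror of attention.call for x of shape (batch, timesteps, units)."""
    e = np.tanh(x @ W + b)                  # alignment scores: (batch, timesteps, 1)
    e = np.squeeze(e, axis=-1)              # (batch, timesteps)
    e = e - e.max(axis=-1, keepdims=True)   # shift for numerical stability
    alpha = np.exp(e) / np.exp(e).sum(axis=-1, keepdims=True)  # softmax over timesteps
    alpha = alpha[..., np.newaxis]          # (batch, timesteps, 1)
    context = (x * alpha).sum(axis=1)       # weighted sum over time: (batch, units)
    return context, alpha

rng = np.random.default_rng(0)
batch, timesteps, units = 4, 10, 64         # illustrative sizes, not from the question

x = rng.normal(size=(batch, timesteps, units))  # stand-in for LSTM(return_sequences=True)
W = rng.normal(scale=0.05, size=(units, 1))     # same shape as attention_weight
b = np.zeros((timesteps, 1))                    # same shape as attention_bias

context, alpha = attention_context(x, W, b)

# Imitate RepeatVector(n=timesteps): copy the context vector along a new time axis.
repeated = np.repeat(context[:, np.newaxis, :], timesteps, axis=1)

print(context.shape)   # (4, 64)
print(repeated.shape)  # (4, 10, 64)
```

With a 3-D input the attention weights are normalized across the timesteps, the context vector is 2-D, and the repeated tensor is again a sequence that the decoder LSTM can consume.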