Input 0 is incompatible with layer repeat_vector_40: expected ndim=2, found ndim=1

Question

I am developing an LSTM autoencoder model for anomaly detection. My Keras model is set up as follows:

from keras.models import Sequential

from keras import Model, layers
from keras.layers import Layer, Conv1D, Input, Masking, Dense, RNN, LSTM, Dropout, RepeatVector, TimeDistributed, Reshape

def create_RNN_with_attention():
    x=Input(shape=(X_train_dt.shape[1], X_train_dt.shape[2]))
    RNN_layer_1 = LSTM(units=64, return_sequences=False)(x)
    attention_layer = attention()(RNN_layer_1)
    dropout_layer_1 = Dropout(rate=0.2)(attention_layer)
    repeat_vector_layer = RepeatVector(n=X_train_dt.shape[1])(dropout_layer_1)
    RNN_layer_2 = LSTM(units=64, return_sequences=True)(repeat_vector_layer)
    dropout_layer_1 = Dropout(rate=0.2)(RNN_layer_2)
    output = TimeDistributed(Dense(X_train_dt.shape[2], trainable=True))(dropout_layer_1)
    model=Model(x,output)
    model.compile(loss='mae', optimizer='adam')    
    return model

Note the attention layer I added, attention_layer. Before adding it, the model compiled fine, but after adding attention_layer the model throws the following error: ValueError: Input 0 is incompatible with layer repeat_vector_40: expected ndim=2, found ndim=1

My attention layer is set up as follows:

import keras.backend as K
class attention(Layer):
    def __init__(self,**kwargs):
        super(attention,self).__init__(**kwargs)
 
    def build(self,input_shape):
        self.W=self.add_weight(name='attention_weight', shape=(input_shape[-1],1), 
                               initializer='random_normal', trainable=True)
        self.b=self.add_weight(name='attention_bias', shape=(input_shape[1],1), 
                               initializer='zeros', trainable=True)        
        super(attention, self).build(input_shape)
 
    def call(self,x):
        # Alignment scores. Pass them through tanh function
        e = K.tanh(K.dot(x,self.W)+self.b)
        # Remove dimension of size 1
        e = K.squeeze(e, axis=-1)   
        # Compute the weights
        alpha = K.softmax(e)
        # Reshape to tensorFlow format
        alpha = K.expand_dims(alpha, axis=-1)
        # Compute the context vector
        context = x * alpha
        context = K.sum(context, axis=1)
        return context

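The idea of the attention mask is to let the model focus on the more prominent features as it trains.

Why am I getting the error above, and how can I fix it?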

Answer 1

I think the problem lies in this line:

RNN_layer_1 = LSTM(units=64, return_sequences=False)(x)

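This layer outputs a tensor of shape (batch_size, 64). That means you output a single vector per sample and then run the attention mechanism with respect to the batch dimension instead of the sequence dimension. It also means the attention layer returns a tensor whose batch dimension has been squeezed away (ndim=1), which no Keras layer accepts. This is why the RepeatVector layer raises the error: it expects a 2-D input of shape (batch_dimension, dim).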
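To make the shape problem concrete, here is a minimal NumPy sketch that re-states the question's attention.call step by step and feeds it a 2-D input, as happens with return_sequences=False. It is an illustration, not the Keras implementation: the NumPy calls stand in for the K.* backend ops, the helper name attention_call is made up for this example, and the batch size of 1 is an assumption chosen so that the bias broadcast goes through (in Keras the batch size is unknown at graph-build time, which presumably lets the same broadcast pass shape inference).

```python
import numpy as np

def attention_call(x, W, b):
    """NumPy re-statement of the question's attention.call (illustration only)."""
    e = np.tanh(np.dot(x, W) + b)                       # alignment scores
    e = np.squeeze(e, axis=-1)                          # drop the trailing size-1 axis
    alpha = np.exp(e - e.max(axis=-1, keepdims=True))   # softmax over the last axis
    alpha = alpha / alpha.sum(axis=-1, keepdims=True)
    alpha = np.expand_dims(alpha, axis=-1)
    context = x * alpha
    return np.sum(context, axis=1)                      # sum over axis 1

rng = np.random.default_rng(0)
units = 64

# What LSTM(..., return_sequences=False) hands to the attention layer:
# a 2-D tensor (batch, units). Batch size 1 is an assumption for this sketch.
x = rng.normal(size=(1, units))
W = rng.normal(size=(units, 1))   # shape=(input_shape[-1], 1)
b = np.zeros((units, 1))          # shape=(input_shape[1], 1); axis 1 is also `units` here

out = attention_call(x, W, b)
print(out.shape, out.ndim)        # (64,) 1
```

With a 2-D input, axis 1 is the feature axis rather than the time axis, so the bias takes the feature size and the final sum leaves a 1-D result with no batch axis, which matches the "found ndim=1" in the error message.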

If you want to run the attention mechanism over the sequence, switch the line mentioned above to:

RNN_layer_1 = LSTM(units=64, return_sequences=True)(x)
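As a rough check of the shapes after this change, the sketch below re-states the question's attention.call in NumPy and feeds it a 3-D input, as produced with return_sequences=True. The helper name attention_call, the sizes (batch 4, 10 timesteps, 64 units), and the use of np.repeat as a stand-in for RepeatVector are assumptions made for illustration; none of this is the Keras code itself.

```python
import numpy as np

def attention_call(x, W, b):
    """NumPy re-statement of the question's attention.call (illustration only)."""
    e = np.tanh(np.dot(x, W) + b)                       # (batch, timesteps, 1)
    e = np.squeeze(e, axis=-1)                          # (batch, timesteps)
    alpha = np.exp(e - e.max(axis=-1, keepdims=True))   # softmax over timesteps
    alpha = alpha / alpha.sum(axis=-1, keepdims=True)
    alpha = np.expand_dims(alpha, axis=-1)              # (batch, timesteps, 1)
    context = x * alpha                                 # weight every timestep
    return np.sum(context, axis=1)                      # (batch, units)

rng = np.random.default_rng(0)
batch, timesteps, units = 4, 10, 64   # hypothetical sizes

# What LSTM(units=64, return_sequences=True) hands to the attention layer.
x = rng.normal(size=(batch, timesteps, units))
W = rng.normal(size=(units, 1))       # shape=(input_shape[-1], 1)
b = np.zeros((timesteps, 1))          # shape=(input_shape[1], 1); axis 1 is time here

context = attention_call(x, W, b)
# Stand-in for RepeatVector(n=timesteps): copy the context vector along a new time axis.
repeated = np.repeat(context[:, np.newaxis, :], timesteps, axis=1)

print(context.shape)    # (4, 64)
print(repeated.shape)   # (4, 10, 64)
```

Here the softmax runs over the timesteps and the sum collapses the time axis, so the context vector keeps its batch axis and has ndim=2, which is what RepeatVector expects before the decoder LSTM.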

Tags: python, lstm, keras, tensorflow, attention-model