Input 0 is incompatible with layer repeat_vector_40: expected ndim=2, found ndim=1
I am developing an LSTM autoencoder model for anomaly detection. My Keras model is set up as follows:
from keras.models import Sequential
from keras import Model, layers
from keras.layers import Layer, Conv1D, Input, Masking, Dense, RNN, LSTM, Dropout, RepeatVector, TimeDistributed, Reshape

def create_RNN_with_attention():
    x = Input(shape=(X_train_dt.shape[1], X_train_dt.shape[2]))
    RNN_layer_1 = LSTM(units=64, return_sequences=False)(x)
    attention_layer = attention()(RNN_layer_1)
    dropout_layer_1 = Dropout(rate=0.2)(attention_layer)
    repeat_vector_layer = RepeatVector(n=X_train_dt.shape[1])(dropout_layer_1)
    RNN_layer_2 = LSTM(units=64, return_sequences=True)(repeat_vector_layer)
    dropout_layer_2 = Dropout(rate=0.2)(RNN_layer_2)
    output = TimeDistributed(Dense(X_train_dt.shape[2], trainable=True))(dropout_layer_2)
    model = Model(x, output)
    model.compile(loss='mae', optimizer='adam')
    return model
Note the attention layer that I added, attention_layer. Before adding it, the model compiled fine, but after adding this attention_layer the model throws the following error: ValueError: Input 0 is incompatible with layer repeat_vector_40: expected ndim=2, found ndim=1
My attention layer is set up as follows:
import keras.backend as K

class attention(Layer):
    def __init__(self, **kwargs):
        super(attention, self).__init__(**kwargs)

    def build(self, input_shape):
        self.W = self.add_weight(name='attention_weight', shape=(input_shape[-1], 1),
                                 initializer='random_normal', trainable=True)
        self.b = self.add_weight(name='attention_bias', shape=(input_shape[1], 1),
                                 initializer='zeros', trainable=True)
        super(attention, self).build(input_shape)

    def call(self, x):
        # Alignment scores. Pass them through tanh function
        e = K.tanh(K.dot(x, self.W) + self.b)
        # Remove dimension of size 1
        e = K.squeeze(e, axis=-1)
        # Compute the weights
        alpha = K.softmax(e)
        # Reshape to tensorFlow format
        alpha = K.expand_dims(alpha, axis=-1)
        # Compute the context vector
        context = x * alpha
        context = K.sum(context, axis=1)
        return context
The idea of the attention layer is to let the model focus on the more salient features as it trains.
Why does this error occur, and how can I fix it?
I think the problem lies in this line:
RNN_layer_1 = LSTM(units=64, return_sequences=False)(x)
This layer outputs a tensor of shape (batch_size, 64). That means you output a single vector per sample and then run the attention mechanism with respect to the batch dimension instead of the sequence dimension. Your attention layer then collapses that vector even further (the K.sum over axis=1 leaves a tensor of shape (batch_size,), i.e. ndim=1), which no Keras layer will accept. That is why the RepeatVector layer raises the error: it expects a tensor of at least shape (batch_dimension, dim).
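To make the shapes concrete, here is a minimal check; the timestep and feature counts are made up purely for illustration:

from keras import Input
from keras.layers import LSTM

timesteps, features = 10, 5          # hypothetical dimensions
x = Input(shape=(timesteps, features))

# return_sequences=False -> one hidden vector per sample
print(LSTM(64, return_sequences=False)(x).shape)   # (None, 64)
# Inside attention.call, K.sum(context, axis=1) then removes the last
# remaining non-batch axis, leaving shape (None,), i.e. ndim=1 -- the
# tensor that RepeatVector rejects.

# return_sequences=True -> the full sequence of hidden states
print(LSTM(64, return_sequences=True)(x).shape)    # (None, 10, 64)
# Summing over axis=1 here yields (None, 64), which RepeatVector accepts.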
If you want to run the attention mechanism over the sequence instead, you should change the line above to:
RNN_layer_1 = LSTM(units=64, return_sequences=True)(x)
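With that single change, the rest of the model from the question can stay as it is; a sketch of the corrected function (same names and hyperparameters as above) would be:

def create_RNN_with_attention():
    x = Input(shape=(X_train_dt.shape[1], X_train_dt.shape[2]))
    # Return the full sequence so attention can weight the timesteps
    RNN_layer_1 = LSTM(units=64, return_sequences=True)(x)
    attention_layer = attention()(RNN_layer_1)          # -> (batch_size, 64)
    dropout_layer_1 = Dropout(rate=0.2)(attention_layer)
    repeat_vector_layer = RepeatVector(n=X_train_dt.shape[1])(dropout_layer_1)
    RNN_layer_2 = LSTM(units=64, return_sequences=True)(repeat_vector_layer)
    dropout_layer_2 = Dropout(rate=0.2)(RNN_layer_2)
    output = TimeDistributed(Dense(X_train_dt.shape[2], trainable=True))(dropout_layer_2)
    model = Model(x, output)
    model.compile(loss='mae', optimizer='adam')
    return model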