How to set custom initial weights for a BiLSTM layer in Keras?
I am currently building a BiLSTM with attention, with the BiLSTM layer weights optimised using the Antlion algorithm. The Antlion algorithm is implemented in MATLAB code, and I was able to integrate Python and MATLAB to receive the optimised weights as shown below:
# LSTM hidden nodes
hidden_nodes = 11

import numpy as np
import matlab.engine

eng = matlab.engine.start_matlab()
# call optimised_weights.m
forward_kernel, backward_kernel, forward_recurrent, backward_recurrent = \
    eng.optimised_weights(int(hidden_nodes), nargout=4)
eng.quit()

# convert the matlab.double outputs to numpy arrays
forward_kernel = np.array(forward_kernel)
backward_kernel = np.array(backward_kernel)
forward_recurrent = np.array(forward_recurrent)
backward_recurrent = np.array(backward_recurrent)
I am currently having trouble setting the weights and biases for the BiLSTM layer in the model below (where no custom initial weights are set yet):
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Layer

class attention(Layer):
    def __init__(self, return_sequences=True, **kwargs):
        self.return_sequences = return_sequences
        super(attention, self).__init__(**kwargs)

    def build(self, input_shape):
        self.W = self.add_weight(name="att_weight", shape=(input_shape[-1], 1),
                                 initializer="normal")
        self.b = self.add_weight(name="att_bias", shape=(input_shape[1], 1),
                                 initializer="zeros")
        super(attention, self).build(input_shape)

    def call(self, x):
        e = K.tanh(K.dot(x, self.W) + self.b)
        a = K.softmax(e, axis=1)
        output = x * a
        if self.return_sequences:
            return output
        return K.sum(output, axis=1)

    def get_config(self):
        # For serialization with 'custom_objects'
        config = super().get_config()
        config['return_sequences'] = self.return_sequences
        return config
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Bidirectional, LSTM, Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

model = Sequential()
model.add(Input(shape=(5, 1)))
model.add(Bidirectional(LSTM(hidden_nodes, return_sequences=True)))
model.add(attention(return_sequences=False))  # this is a custom layer...
model.add(Dense(104, activation="sigmoid"))
model.add(Dropout(0.2))
model.add(Dense(1, activation="sigmoid"))
model.compile(optimizer=tf.keras.optimizers.Adam(epsilon=1e-08, learning_rate=0.01),
              loss='mse')

es = EarlyStopping(monitor='val_loss', mode='min', verbose=2, patience=50)
mc = ModelCheckpoint('model.h5', monitor='val_loss',
                     mode='min', verbose=2, save_best_only=True)
I have tried the following:
model.add(Bidirectional(LSTM(hidden_nodes, return_sequences=True,
                             weights=[forward_kernel, forward_recurrent, np.zeros(20,),
                                      backward_kernel, backward_recurrent, np.zeros(20,)])))
But the weights and biases change once the model is compiled... even if the kernel, recurrent, and bias initializers are set to None...
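One minimal way to check this symptom (a sketch, assuming the Sequential model built above) is to read the BiLSTM weights back before and after compile() and compare:

bilstm = model.layers[0]  # the Bidirectional(LSTM(...)) layer
w_before = bilstm.get_weights()
model.compile(optimizer=tf.keras.optimizers.Adam(epsilon=1e-08, learning_rate=0.01),
              loss='mse')
w_after = bilstm.get_weights()
# True only if compile() left every weight array untouched
print(all(np.array_equal(a, b) for a, b in zip(w_before, w_after)))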
I have referred to this link: https://keras.io/api/layers/initializers/ but could not relate it to my problem...

I would greatly appreciate any insight into solving this problem, and into whether I am missing any fundamental part. I would be glad to share more details if needed.

Thank you!
Use tf.constant_initializer to provide your custom weights as an np.array. Furthermore, since you are using the Bidirectional wrapper, you need to explicitly specify the backward layer with your custom weights:
layer = Bidirectional(
    LSTM(
        hidden_nodes,
        return_sequences=True,
        kernel_initializer=tf.constant_initializer(forward_kernel),
        recurrent_initializer=tf.constant_initializer(forward_recurrent),
    ),
    backward_layer=LSTM(
        hidden_nodes,
        return_sequences=True,
        kernel_initializer=tf.constant_initializer(backward_kernel),
        recurrent_initializer=tf.constant_initializer(backward_recurrent),
        go_backwards=True,
    ),
)
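To confirm the initializers were actually applied, one possible check (a sketch, assuming the forward_kernel and backward_kernel arrays from the question already have the shapes listed below) is to build the layer on a dummy batch and read the weights back:

# Build the layer by calling it on a dummy input matching Input(shape=(5, 1)),
# then read the weights back. For a Bidirectional LSTM, get_weights() returns
# [fwd kernel, fwd recurrent, fwd bias, bwd kernel, bwd recurrent, bwd bias].
_ = layer(tf.zeros((1, 5, 1)))
w = layer.get_weights()
print(np.allclose(w[0], forward_kernel), np.allclose(w[3], backward_kernel))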
Pay attention to the expected shapes of the weights. Since the layer's input is (batch, timesteps, features), your weights should have the following shapes (accounting for the 4 gates in an LSTM cell); a shape check is sketched after the list:
- kernel: (features, 4*hidden_nodes)
- recurrent kernel: (hidden_nodes, 4*hidden_nodes)
- bias: (4*hidden_nodes)
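For the model in the question, Input(shape=(5, 1)) gives features=1, and hidden_nodes=11, so the forward (and likewise backward) arrays would be shaped as below; the random arrays here are hypothetical stand-ins for the output of optimised_weights.m:

import numpy as np

features, hidden_nodes = 1, 11  # matches Input(shape=(5, 1)) above

# Stand-in arrays with the shapes the LSTM expects; in practice these
# would come from optimised_weights.m.
forward_kernel = np.random.randn(features, 4 * hidden_nodes)         # (1, 44)
forward_recurrent = np.random.randn(hidden_nodes, 4 * hidden_nodes)  # (11, 44)
forward_bias = np.zeros(4 * hidden_nodes)                            # (44,)

assert forward_kernel.shape == (1, 44)
assert forward_recurrent.shape == (11, 44)
assert forward_bias.shape == (44,)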