Pre-trained BERT not the right shape for LSTM layer: Value Error, total size of new array must be unchanged
I'm trying to use a pre-trained BERT model in a Siamese neural network. However, I'm having trouble passing the BERT output to the shared LSTM layer. I get the following error:
ValueError: Exception encountered when calling layer "reshape_4" (type Reshape).
total size of new array must be unchanged, input_shape = [768], output_shape = [64, 768, 1]
Call arguments received:
• inputs=tf.Tensor(shape=(None, 768), dtype=float32)
I've read in several other posts that the input dimensions to my LSTM should be [batch_size, 768, 1]. However, when I try to reshape, I run into the error above. How can I fix it?
import tensorflow as tf
from tensorflow.keras.layers import Input, Reshape, Bidirectional, LSTM, Lambda, Dense
from tensorflow.keras.models import Model

# bert_preprocess and bert_encoder are TF Hub KerasLayers loaded elsewhere.
input_1 = Input(shape=(), dtype=tf.string, name='text_1')  # input names must be unique
preprocessed_text_1 = bert_preprocess(input_1)
outputs_1 = bert_encoder(preprocessed_text_1)
e1 = Reshape((64, 768, 1))(outputs_1['pooled_output'])

input_2 = Input(shape=(), dtype=tf.string, name='text_2')
preprocessed_text_2 = bert_preprocess(input_2)
outputs_2 = bert_encoder(preprocessed_text_2)
e2 = Reshape((64, 768, 1))(outputs_2['pooled_output'])
lstm_layer = Bidirectional(LSTM(50, dropout=0.2, recurrent_dropout=0.2)) # Won't work on GPU
x1 = lstm_layer(e1)
x2 = lstm_layer(e2)
# exponent_neg_cosine_distance is a custom similarity function defined elsewhere.
mhd = lambda x: exponent_neg_cosine_distance(x[0], x[1])
merged = Lambda(function=mhd, output_shape=lambda x: x[0], name='cosine_distance')([x1, x2])
preds = Dense(1, activation='sigmoid')(merged)
model = Model(inputs=[input_1, input_2], outputs=preds)
You have to remove the batch size (=64) from the Reshape layer. Keras's Reshape takes the target shape without the batch dimension (the batch axis is kept implicitly), so (64, 768, 1) asks it to expand the 768 values of each sample into 64 × 768 × 1 values, which is why the total-size check fails.
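A minimal sketch of the fix, assuming everything else in the model stays as posted:

from tensorflow.keras.layers import Reshape

# Target shape excludes the batch axis; Keras keeps it implicitly.
e1 = Reshape((768, 1))(outputs_1['pooled_output'])  # shape: (None, 768, 1)
e2 = Reshape((768, 1))(outputs_2['pooled_output'])

The Bidirectional LSTM then sees each pooled BERT vector as a sequence of 768 timesteps with one feature per step, and the batch size is inferred at runtime from the data you pass to model.fit.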