How to build a character-level siamese network using Keras
I am trying to build a siamese neural network at the character level with Keras, to learn whether two names are similar.
So each of my two inputs, X1 and X2, is a 3-D matrix:
X[number_of_cases, max_length_of_name, total_number_of_chars_in_DB]
In my real case:
- number_of_cases = 5000
- max_length_of_name = 50
- total_number_of_chars_in_DB = 38
I have a binary output matrix of size y[number_of_cases].
For example:
print(X1[:3, :2])
gives the following result:
[[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]]
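For reference, input tensors of this shape can be built with a minimal NumPy sketch like the one below (the helper `one_hot_names` and the 38-symbol `alphabet` are my assumptions for illustration, not from the question):

```python
import numpy as np

def one_hot_names(names, max_len, alphabet):
    # Hypothetical helper: maps each character to its index in `alphabet`
    # and fills a (n_names, max_len, len(alphabet)) one-hot tensor.
    # Positions beyond the end of a name are left as all-zero padding.
    char_to_idx = {c: i for i, c in enumerate(alphabet)}
    X = np.zeros((len(names), max_len, len(alphabet)), dtype=np.float32)
    for i, name in enumerate(names):
        for j, ch in enumerate(name[:max_len]):
            X[i, j, char_to_idx[ch]] = 1.0
    return X

alphabet = "abcdefghijklmnopqrstuvwxyz0123456789 -"  # 38 symbols, matching the question
X1 = one_hot_names(["anna", "ann"], max_len=50, alphabet=alphabet)
print(X1.shape)  # (2, 50, 38)
```

Each row of the last axis has exactly one 1 for a real character and is all zeros for padding, which matches the printed slices above.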
I build my model using the following code:
from keras.models import Model
from keras.layers import Input, Dense, LSTM, Bidirectional, Lambda
from keras import backend as k
input_1 = Input(shape=(X1.shape[1], X1.shape[2],))
input_2 = Input(shape=(X2.shape[1], X2.shape[2],))
lstm1 = Bidirectional(LSTM(256, input_shape=(X1.shape[1], X1.shape[2],), return_sequences=False))
lstm2 = Bidirectional(LSTM(256, input_shape=(X1.shape[1], X1.shape[2],), return_sequences=False))
l1_norm = lambda x: 1 - k.abs(x[0] - x[1])
merged = Lambda(function=l1_norm, output_shape=lambda x: x[0], name='L1_distance')([lstm1, lstm2])
predictions = Dense(1, activation = 'sigmoid', name='classification_layer')(merged)
model = Model([input_1, input_2], predictions)
model.compile(loss = 'binary_crossentropy', optimizer="adam", metrics=["accuracy"])
model.fit([X1, X2], validation_split=0.1, epochs = 20,shuffle=True, batch_size = 256)
I get the following error:
Layer L1_distance was called with an input that isn't a symbolic tensor.
I think the problem is that I need to tell the L1_distance layer to use the outputs of the two preceding LSTM layers, but I don't know how to do that.
Second question: do I have to add an embedding layer before the LSTM, even in a character-level network?
Thanks.
Your model's inputs are [input_1, input_2] and its output is predictions. But input_1 and input_2 are never connected to lstm1 and lstm2, so the model's input layers are not connected to its output layer, which is why you get the error.
Try this:
from keras.models import Model
from keras.layers import Input, Dense, LSTM, Bidirectional, Lambda
from keras import backend as k
input_1 = Input(shape=(X1.shape[1], X1.shape[2],))
input_2 = Input(shape=(X2.shape[1], X2.shape[2],))
lstm1 = Bidirectional(LSTM(256, return_sequences=False))(input_1)
lstm2 = Bidirectional(LSTM(256, return_sequences=False))(input_2)
l1_norm = lambda x: 1 - k.abs(x[0] - x[1])
merged = Lambda(function=l1_norm, output_shape=lambda x: x[0], name='L1_distance')([lstm1, lstm2])
predictions = Dense(1, activation = 'sigmoid', name='classification_layer')(merged)
model = Model([input_1, input_2], predictions)
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit([X1, X2], y, validation_split=0.1, epochs=20, shuffle=True, batch_size=256)
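As a quick sanity check of the merge, the `l1_norm` lambda computes an elementwise similarity, 1 - |x0 - x1|, between the two encodings. The same operation on toy vectors (a NumPy stand-in for the backend op, not Keras code):

```python
import numpy as np

# Toy stand-ins for the two BiLSTM output vectors
a = np.array([0.9, 0.1, 0.5])
b = np.array([0.8, 0.1, 0.0])

# Same elementwise op as the L1_distance Lambda layer:
# values near 1 where the two encodings agree, lower where they differ
sim = 1 - np.abs(a - b)
print(sim)
```

The result keeps the encoding's dimensionality, so the final Dense(1, sigmoid) layer can weigh each dimension's agreement when producing the match probability.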