How do I correctly use the Keras Embedding layer?
I wrote the following multi-input Keras TensorFlow model:
CHARPROTLEN = 25 #size of vocab
CHARCANSMILEN = 62 #size of vocab
protein_input = Input(shape=(train_protein.shape[1:]))
compound_input = Input(shape=(train_smile.shape[1:]))
#protein layers
x = Embedding(input_dim=CHARPROTLEN+1,output_dim=128, input_length=maximum_amino_acid_sequence_length) (protein_input)
x = Conv1D(filters=32, padding="valid", activation="relu", strides=1, kernel_size=4)(x)
x = Conv1D(filters=64, padding="valid", activation="relu", strides=1, kernel_size=8)(x)
x = Conv1D(filters=96, padding="valid", activation="relu", strides=1, kernel_size=12)(x)
final_protein = GlobalMaxPooling1D()(x)
#compound layers
y = Embedding(input_dim=CHARCANSMILEN+1,output_dim=128, input_length=maximum_SMILES_length) (compound_input)
y = Conv1D(filters=32, padding="valid", activation="relu", strides=1, kernel_size=4)(y)
y = Conv1D(filters=64, padding="valid", activation="relu", strides=1, kernel_size=6)(y)
y = Conv1D(filters=96, padding="valid", activation="relu", strides=1, kernel_size=8)(y)
final_compound = GlobalMaxPooling1D()(y)
join = tf.keras.layers.concatenate([final_protein, final_compound], axis=-1)
x = Dense(1024, activation="relu")(join)
x = Dropout(0.1)(x)
x = Dense(1024, activation='relu')(x)
x = Dropout(0.1)(x)
x = Dense(512, activation='relu')(x)
predictions = Dense(1,kernel_initializer='normal')(x)
model = Model(inputs=[protein_input, compound_input], outputs=[predictions])
The inputs have the following shapes:
train_protein.shape
TensorShape([5411, 1500, 1])
train_smile.shape
TensorShape([5411, 100, 1])
I get the following error message:
ValueError: One of the dimensions in the output is <= 0 due to downsampling in conv1d. Consider increasing the input size. Received input shape [None, 1500, 1, 128] which would produce output shape with a zero or negative value in a dimension.
Is this because the output_dim of the Embedding layer is incorrect? How can I fix this? Thanks.
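A Conv1D layer expects input of shape (batch_size, timesteps, features), which train_protein and train_smile already have. For example, train_protein contains 5411 samples, each with 1500 timesteps and one feature per timestep. An Embedding layer appends one more axis of size output_dim to whatever it receives, so applying it to this data produces a 4D tensor of shape (None, 1500, 1, 128), which the Conv1D layers cannot work with.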
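The shape arithmetic behind the error can be sketched in plain Python. The two helper functions below are hypothetical and only mimic shape bookkeeping; they are not Keras APIs. The sketch assumes, as the error message suggests, that Conv1D treats the second-to-last axis as the timestep axis and that a "valid" convolution yields (steps - kernel_size) // stride + 1 output steps.

```python
def embedding_output_shape(input_shape, output_dim):
    """An embedding lookup appends one axis of size output_dim."""
    return tuple(input_shape) + (output_dim,)


def conv1d_valid_steps(steps, kernel_size, stride=1):
    """Number of output steps of a 1D convolution with padding='valid'."""
    return (steps - kernel_size) // stride + 1


# Data as stored in the question: (batch, 1500, 1)
bad_shape = embedding_output_shape((None, 1500, 1), 128)
bad_steps = conv1d_valid_steps(bad_shape[-2], kernel_size=4)

# Data with the trailing axis removed: (batch, 1500)
good_shape = embedding_output_shape((None, 1500), 128)
steps = good_shape[-2]
good_steps = []
for kernel_size in (4, 8, 12):  # kernel sizes of the protein branch
    steps = conv1d_valid_steps(steps, kernel_size)
    good_steps.append(steps)

print(bad_shape, bad_steps)    # (None, 1500, 1, 128) -2
print(good_shape, good_steps)  # (None, 1500, 128) [1497, 1490, 1479]
```

With the trailing axis of size 1 in place, the first convolution sees a single timestep and a kernel of size 4 does not fit, which matches the "zero or negative value in a dimension" wording of the error.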
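You have two options. Either omit the Embedding layers entirely and feed the inputs directly to the Conv1D layers, or reshape the data to (5411, 1500) for train_protein and (5411, 100) for train_smile. You can use tf.reshape, tf.squeeze, or tf.keras.layers.Reshape to reshape the data. After that, you can use the Embedding layers as planned. Also note that output_dim sets the size of the vector that each timestep is mapped to.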
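A minimal sketch of the second option for the protein branch, assuming TensorFlow 2.x with tf.keras. It uses a Reshape layer inside the model so the stored (1500, 1) data can be passed unchanged; the dummy batch and the names VOCAB_SIZE, SEQ_LEN, and protein_branch are illustrative and not from the question. The input_length argument is left out because newer Keras versions deprecate it.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

VOCAB_SIZE = 25   # protein vocabulary size (CHARPROTLEN in the question)
SEQ_LEN = 1500    # timesteps per protein sequence

# The input keeps the (1500, 1) shape of the stored data.
protein_input = tf.keras.Input(shape=(SEQ_LEN, 1), dtype="int32")

# Drop the trailing axis: (batch, 1500, 1) -> (batch, 1500).
x = layers.Reshape((SEQ_LEN,))(protein_input)

# Embedding now yields a 3D tensor: (batch, 1500, 128).
x = layers.Embedding(input_dim=VOCAB_SIZE + 1, output_dim=128)(x)
embedded_shape = tuple(x.shape)

x = layers.Conv1D(filters=32, kernel_size=4, padding="valid", activation="relu")(x)
x = layers.Conv1D(filters=64, kernel_size=8, padding="valid", activation="relu")(x)
x = layers.Conv1D(filters=96, kernel_size=12, padding="valid", activation="relu")(x)
final_protein = layers.GlobalMaxPooling1D()(x)

protein_branch = tf.keras.Model(protein_input, final_protein)

# Dummy integer-encoded batch standing in for train_protein.
dummy = np.random.randint(0, VOCAB_SIZE + 1, size=(2, SEQ_LEN, 1)).astype("int32")
out = protein_branch.predict(dummy, verbose=0)
print(embedded_shape, out.shape)
```

The compound branch would follow the same pattern with a sequence length of 100. Alternatively, the data can be squeezed once up front, for example with tf.squeeze(train_protein, axis=-1), and the Input declared with shape (1500,).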