How to add additional data to a CNN+LSTM network
I have the following network (a pretrained CNN + LSTM that classifies videos):
from tensorflow.keras import Input, Model
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import TimeDistributed, LSTM, Dense

frames, channels, rows, columns = 5, 3, 224, 224
video = Input(shape=(frames, rows, columns, channels))
cnn_base = VGG16(input_shape=(rows, columns, channels),
                 weights="imagenet",
                 include_top=True)  # <=== include_top=True
cnn_base.trainable = False
cnn = Model(cnn_base.input, cnn_base.layers[-3].output, name="VGG_fm")  # layers[-3] is the 4096-unit fc layer
encoded_frames = TimeDistributed(cnn, name="encoded_frames")(video)
encoded_sequence = LSTM(256, name="encoded_seqeunce")(encoded_frames)
hidden_layer = Dense(1024, activation="relu", name="hidden_layer")(encoded_sequence)
outputs = Dense(10, activation="softmax")(hidden_layer)
model = Model(video, outputs)
It looks like this:
Now I want to feed a 1-D vector of 784 features per video into the last layer.
I tried replacing the last two lines with:
encoding_input = keras.Input(shape=(784,), name="Encoding", dtype='float')
sentence_features = layers.Dense(units=60, name='sentence_features')(encoding_input)
x = layers.concatenate([sentence_features, hidden_layer])
outputs = Dense(10, activation="softmax")(x)
but got this error:
ValueError: Graph disconnected: cannot obtain value for tensor Tensor("Sentence-Input-Encoding_3:0", shape=(None, 784), dtype=float32) at layer "sentence_features". The following previous layers were accessed without issue: ['encoded_frames', 'encoded_seqeunce']
Any suggestions?

Your network now has two inputs... don't forget to pass both of them to your model:
model = Model([video,encoding_input], outputs)
Full example:
from tensorflow.keras import Input, Model
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import TimeDistributed, LSTM, Dense, concatenate

frames, channels, rows, columns = 5, 3, 224, 224
video = Input(shape=(frames, rows, columns, channels))
cnn_base = VGG16(input_shape=(rows, columns, channels),
                 weights="imagenet",
                 include_top=True)
cnn_base.trainable = False
cnn = Model(cnn_base.input, cnn_base.layers[-3].output, name="VGG_fm")
encoded_frames = TimeDistributed(cnn, name="encoded_frames")(video)
encoded_sequence = LSTM(256, name="encoded_seqeunce")(encoded_frames)
hidden_layer = Dense(1024, activation="relu", name="hidden_layer")(encoded_sequence)
encoding_input = Input(shape=(784,), name="Encoding", dtype='float32')
sentence_features = Dense(units=60, name='sentence_features')(encoding_input)
x = concatenate([sentence_features, hidden_layer])
outputs = Dense(10, activation="softmax")(x)
model = Model([video, encoding_input], outputs)  # <=== double input
model.summary()
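Once the model has two inputs, `fit` and `predict` also expect two arrays (a list, or a dict keyed by input name). Here is a minimal runnable sketch of that calling convention; it uses small `Dense` stand-ins instead of the VGG16+LSTM branch so it runs without downloading ImageNet weights (the layer sizes here are illustrative, not from the original model):

```python
import numpy as np
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Dense, concatenate

# Two-input model: a stand-in for the video branch plus the 784-feature encoding.
video_features = Input(shape=(32,), name="video_features")
encoding_input = Input(shape=(784,), name="Encoding")

hidden_layer = Dense(64, activation="relu")(video_features)
sentence_features = Dense(60, name="sentence_features")(encoding_input)
x = concatenate([sentence_features, hidden_layer])
outputs = Dense(10, activation="softmax")(x)

model = Model([video_features, encoding_input], outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy")

# fit/predict take one array per input, in the same order as Model's inputs.
n = 4
dummy_video = np.random.rand(n, 32).astype("float32")
dummy_enc = np.random.rand(n, 784).astype("float32")
dummy_y = np.eye(10)[np.random.randint(0, 10, n)].astype("float32")

model.fit([dummy_video, dummy_enc], dummy_y, epochs=1, verbose=0)
preds = model.predict([dummy_video, dummy_enc], verbose=0)  # shape (4, 10)
```

Passing a dict instead, e.g. `{"video_features": dummy_video, "Encoding": dummy_enc}`, works too and is less error-prone when input order is easy to mix up.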