How to use Bidirectional RNN and Conv1D in keras when shapes are not matching?

Question

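I'm new to deep learning, so I've been reading Deep Learning with Keras by Antonio Gulli and learning a lot. I want to start putting some of the concepts to use. Specifically, I want to implement a neural network with a 1-dimensional convolutional layer that feeds into a bidirectional recurrent layer (as in the paper below). All the tutorials and code snippets I have found either do not implement anything remotely similar to this (e.g. they do image recognition) or use an older version of keras with different functions and usage.

What I'm trying to do is a variation of this paper:

(1) convert DNA sequences to one-hot encoding vectors; ✓

(2) use a 1-dimensional convolutional neural network; ✓

(3) apply max pooling; ✓

(4) send the output to a bidirectional RNN; ⓧ

(5) classify the input;

I can't figure out how to get the shapes to match for the Bidirectional RNN. I can't even get an ordinary RNN to work at this stage. How can I restructure the incoming layers so that they work with a bidirectional RNN?

Note: the original code came from https://github.com/uci-cbcl/DanQ/blob/master/DanQ_train.py, but I simplified the output layer to just do binary classification. This process was described (kind of) in https://github.com/fchollet/keras/issues/3322, but I cannot get it to work with the updated keras. The original code (and the second link) work on a very large dataset, so I generated some fake data to illustrate the concept. They also use an older version of keras, and key functions have changed since then.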

# Imports
import tensorflow as tf
import numpy as np
from tensorflow.python.keras._impl.keras.layers.core import *
from tensorflow.python.keras._impl.keras.layers import Conv1D, MaxPooling1D, SimpleRNN, Bidirectional, Input
from tensorflow.python.keras._impl.keras.models import Model, Sequential

# Set up TensorFlow backend
K = tf.keras.backend
K.set_session(tf.Session())
np.random.seed(0) # For keras?

# Constants
NUMBER_OF_POSITIONS = 40
NUMBER_OF_CLASSES = 2
NUMBER_OF_SAMPLES_IN_EACH_CLASS = 25

# Generate sequences
# (data-generation code: https://pastebin.com/GvfLQte2)

# Build model
# ===========
# Input Layer
input_layer = Input(shape=(NUMBER_OF_POSITIONS,4))
# Hidden Layers
y = Conv1D(100, 10, strides=1, activation="relu", )(input_layer)
y = MaxPooling1D(pool_size=5, strides=5)(y)
y = Flatten()(y)
y = Bidirectional(SimpleRNN(100, return_sequences = True, activation="tanh", ))(y)
y = Flatten()(y)
y = Dense(100, activation='relu')(y)
# Output layer
output_layer = Dense(NUMBER_OF_CLASSES, activation="softmax")(y)

model = Model(input_layer, output_layer)
model.compile(optimizer="adam", loss="categorical_crossentropy", )
model.summary()


# ~/anaconda/lib/python3.6/site-packages/tensorflow/python/keras/_impl/keras/layers/recurrent.py in build(self, input_shape)
#    1049     input_shape = tensor_shape.TensorShape(input_shape).as_list()
#    1050     batch_size = input_shape[0] if self.stateful else None
# -> 1051     self.input_dim = input_shape[2]
#    1052     self.input_spec[0] = InputSpec(shape=(batch_size, None, self.input_dim))
#    1053 

# IndexError: list index out of range

Answer 1

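You don't need to restructure anything at all to get the output of a Conv1D layer into an LSTM layer.

The problem is simply the Flatten layer placed before the RNN, which destroys the 3D shape the RNN needs.

These are the shapes used by Conv1D and LSTM:

Conv1D: (batch, length, channels)
LSTM: (batch, timeSteps, features)

Length is the same as timeSteps, and channels is the same as features. The same holds for SimpleRNN.

Using the Bidirectional wrapper won't change anything either. It only doubles the number of output features.

Classifying.

If you're going to classify the entire sequence as a whole, your last LSTM must use return_sequences=False. (Alternatively, keep return_sequences=True and use Flatten + Dense after it.)

If you're going to classify each step of the sequence, all your LSTMs should have return_sequences=True, and you should not flatten the data after them.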
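To make the shape problem concrete, the layer output shapes in the question can be worked out by hand. The sketch below is plain Python, not Keras code, and its two helper functions are hypothetical names introduced only for illustration. It assumes the Keras defaults of "valid" padding for both Conv1D and MaxPooling1D.

```python
# Shape bookkeeping for the model in the question.
# None stands for the batch dimension.

def conv1d_valid_length(length, kernel_size, stride=1):
    """Output length of a 1D convolution with 'valid' padding."""
    return (length - kernel_size) // stride + 1

def maxpool1d_valid_length(length, pool_size, stride):
    """Output length of 1D max pooling with 'valid' padding."""
    return (length - pool_size) // stride + 1

positions = 40                   # Input(shape=(NUMBER_OF_POSITIONS, 4))
filters, kernel_size = 100, 10   # Conv1D(100, 10, strides=1)

conv_len = conv1d_valid_length(positions, kernel_size)
pool_len = maxpool1d_valid_length(conv_len, pool_size=5, stride=5)

shape_after_pool = (None, pool_len, filters)      # 3D: (batch, timeSteps, features)
shape_after_flatten = (None, pool_len * filters)  # 2D: the feature axis is gone

print(shape_after_pool)     # (None, 6, 100)
print(shape_after_flatten)  # (None, 600)
```

The traceback in the question fails at `input_shape[2]`. After Flatten the shape has only two entries, so index 2 does not exist, which is exactly the reported IndexError.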
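The effect of the wrapper on the feature axis can be checked directly. This is a minimal sketch, assuming a TensorFlow 2.x install where `tensorflow.keras` is available (not the private `_impl` imports from the question) and the default `merge_mode="concat"` of Bidirectional. The input shape (6, 100) is chosen to match the pooling output in the question.

```python
from tensorflow import keras
from tensorflow.keras import layers

# A 3D input like the pooling output in the question: 6 steps, 100 features.
inputs = keras.Input(shape=(6, 100))

# Plain RNN: the feature axis equals the number of units.
plain = layers.SimpleRNN(100, return_sequences=True)(inputs)

# Bidirectional RNN: forward and backward outputs are concatenated,
# so the feature axis is twice the number of units.
both = layers.Bidirectional(layers.SimpleRNN(100, return_sequences=True))(inputs)

print(plain.shape)  # expected: (None, 6, 100)
print(both.shape)   # expected: (None, 6, 200)
```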
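Putting the advice together, the model from the question might look like the sketch below once the Flatten layer before the RNN is removed. It assumes TensorFlow 2.x with the public `tensorflow.keras` API. The `build_model` function and its `per_step` flag are hypothetical names used here only to show both classification modes side by side.

```python
from tensorflow import keras
from tensorflow.keras import layers

NUMBER_OF_POSITIONS = 40
NUMBER_OF_CLASSES = 2

def build_model(per_step=False):
    """Conv1D -> MaxPooling1D -> Bidirectional(SimpleRNN) -> Dense.

    There is no Flatten before the RNN, so the RNN receives a 3D tensor.
    """
    inputs = keras.Input(shape=(NUMBER_OF_POSITIONS, 4))
    y = layers.Conv1D(100, 10, strides=1, activation="relu")(inputs)
    y = layers.MaxPooling1D(pool_size=5, strides=5)(y)
    # per_step=False: keep only the last output of each direction,
    #                 giving one prediction per sequence.
    # per_step=True:  keep the output at every step,
    #                 giving one prediction per pooled time step.
    y = layers.Bidirectional(
        layers.SimpleRNN(100, return_sequences=per_step, activation="tanh")
    )(y)
    # Dense acts on the last axis, so it works for 2D and 3D inputs alike.
    y = layers.Dense(100, activation="relu")(y)
    outputs = layers.Dense(NUMBER_OF_CLASSES, activation="softmax")(y)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="categorical_crossentropy")
    return model

whole_sequence_model = build_model(per_step=False)
per_step_model = build_model(per_step=True)

print(whole_sequence_model.output_shape)  # expected: (None, 2)
print(per_step_model.output_shape)        # expected: (None, 6, 2)
```

For the binary classification task in the question, the whole-sequence variant is the one that applies. Note that in the per-step variant the steps are the 6 pooled positions, not the original 40 sequence positions.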
