Multi-dimensional tensors as input to rnn in tensorflow (tf.contrib.rnn.RNNCell)

python
computer-vision
deep-learning
tensorflow
rnn

From the TensorFlow documentation on tf.contrib.rnn.RNNCell: "This definition of cell differs from the definition used in the literature. In the literature, 'cell' refers to an object with a single scalar output. This definition refers to a horizontal array of such units."

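So it seems that an RNN cell only accepts vectors as input. However, I would like to feed images/videos to an RNN (e.g. [batch size, steps, height, width, channels]). Is there a way to do this with RNN cells and dynamic RNN, or do I have to build the RNN manually?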

As you said, an RNN only accepts tensors shaped like [batch_size, sequence_length, features] as input.

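To use an RNN in TensorFlow, you have to extract features from each frame with a CNN, then reshape the CNN output into a tensor of shape [batch_size, sequence_length, features] so that it can be fed to the RNN.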
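The reshaping involved can be sketched roughly as follows. This is a minimal NumPy illustration of the shape handling only, not a TensorFlow implementation: `extract_frame_features` is a hypothetical stand-in for a real CNN (it just average-pools each frame and applies a linear projection), and the function names, sizes and feature dimension are all assumptions chosen for the example.

```python
import numpy as np


def extract_frame_features(frames, weights):
    """Hypothetical stand-in for a CNN (assumption, not a real model).

    frames:  [num_frames, height, width, channels]
    weights: [channels, feature_dim]
    returns: [num_frames, feature_dim]
    """
    # Global average pooling over height and width: [num_frames, channels]
    pooled = frames.mean(axis=(1, 2))
    # Linear projection to the feature dimension: [num_frames, feature_dim]
    return pooled @ weights


def video_to_rnn_input(video, weights):
    """Turn [batch, steps, height, width, channels] into [batch, steps, features]."""
    batch, steps, height, width, channels = video.shape
    # Fold the time axis into the batch axis so every frame is processed independently.
    frames = video.reshape(batch * steps, height, width, channels)
    features = extract_frame_features(frames, weights)
    # Unfold back into one feature vector per time step of each sequence.
    return features.reshape(batch, steps, -1)


rng = np.random.default_rng(0)
video = rng.standard_normal((2, 5, 8, 8, 3))  # 2 clips, 5 frames each, 8x8, 3 channels
weights = rng.standard_normal((3, 16))        # project 3 channels to 16 features
rnn_input = video_to_rnn_input(video, weights)
print(rnn_input.shape)  # (2, 5, 16)
```

The resulting array has the [batch_size, sequence_length, features] layout described above. In an actual model the stand-in function would be replaced by a CNN applied to the folded [batch * steps, height, width, channels] tensor.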
