CNN RNN integration for images

I am trying to integrate a CNN and an LSTM for MNIST images with the following code:

from __future__ import division, print_function, absolute_import
import tensorflow as tf
import tflearn
import numpy as np
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.normalization import local_response_normalization
from tflearn.layers.estimator import regression

import tflearn.datasets.mnist as mnist
height = 128
width = 128
X, Y, testX, testY = mnist.load_data(one_hot=True)
X = X.reshape([-1, 28, 28, 1])
testX = testX.reshape([-1, 28, 28, 1])

# Building convolutional network
network = tflearn.input_data(shape=[None, 28, 28,1], name='input')
network = tflearn.conv_2d(network, 32, 3, activation='relu',regularizer="L2")
network = tflearn.max_pool_2d(network, 2)
network = tflearn.local_response_normalization(network)
network = tflearn.conv_2d(network, 64, 3, activation='relu',regularizer="L2")
network = tflearn.max_pool_2d(network, 2)
network = tflearn.local_response_normalization(network)
network = fully_connected(network, 128, activation='tanh')
network = dropout(network, 0.8)
network = fully_connected(network, 256, activation='tanh')
network = dropout(network, 0.8)
network = tflearn.reshape(network, [-1, 1, 28*28])
#lstm
network = tflearn.lstm(network, 128, return_seq=True)
network = tflearn.lstm(network, 128)
network = tflearn.fully_connected(network, 10, activation='softmax')
network = tflearn.regression(network, optimizer='adam',
                     loss='categorical_crossentropy', name='target')

#train
model = tflearn.DNN(network, tensorboard_verbose=0)
model.fit(X, Y, n_epoch=1, validation_set=0.1, show_metric=True,snapshot_step=100)

A CNN takes a 4D tensor and an LSTM takes a 3D tensor, so I reshaped the network with: network = tflearn.reshape(network, [-1, 1, 28*28])

But I get an error when running it:

InvalidArgumentError (see above for traceback): Input to reshape is a tensor with 16384 values, but the requested shape requires a multiple of 784 [[Node: Reshape/Reshape = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](Dropout_1/cond/Merge, Reshape/Reshape/shape)]]

I don't understand why a tensor of size 16384 is involved, and even hard-coding 128*128 did not work! I can't proceed at all.

The error is in this line:

network = tflearn.reshape(network, [-1, 1, 28*28])

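The preceding FC layer has n_units=256, so each sample carries 256 values and cannot be reshaped to 28*28 = 784. The 16384 in the error message is the size of the tensor being reshaped (consistent with a batch of 64 samples times 256 units), not a size the reshape requires. Change this line to: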

network = tflearn.reshape(network, [-1, 1, 256])

Note that you are feeding the LSTM the features produced by the CNN, not the input MNIST images.
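
To check the arithmetic without building the whole network, here is a minimal NumPy sketch. It is not tflearn code: the zero array is only a stand-in for the output of the last dropout layer, and the batch size of 64 is an assumption chosen because 64 * 256 matches the 16384 in the error message.

```python
import numpy as np

batch_size = 64   # assumed batch size; 64 * 256 = 16384, as in the error message
n_units = 256     # width of the last fully_connected layer in the question

# Stand-in for the output of the last dropout layer: shape (batch, features).
features = np.zeros((batch_size, n_units), dtype=np.float32)
total_values = features.size

# Reshaping to [-1, 1, 28*28] fails: 16384 is not a multiple of 784.
try:
    features.reshape(-1, 1, 28 * 28)
    reshape_failed = False
except ValueError:
    reshape_failed = True

# Reshaping to [-1, 1, 256] works: one time step of 256 features per sample.
sequence = features.reshape(-1, 1, n_units)

print(total_values, reshape_failed, sequence.shape)
```

The last dimension of the reshape has to match what the previous layer actually outputs. With [-1, 1, 256] the LSTM sees each image as a sequence of length 1, so if you want it to see more time steps you would need to choose a different split of those 256 features (for example [-1, 16, 16]), which is a design choice rather than something the fix above requires.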