CNN output regression in tflearn

I am working on a self-driving car. I want to use a CNN in tflearn to predict the steering angle from images, but the model only ever outputs 0.1. What do you think the problem is? The pictures are 128x128, but I have tried resizing them to 28x28 so that I can reuse the code from the MNIST example. The labels are steering angles between 0 and 180. I should also mention that the loss does not decrease during training.

Training.py

import tflearn
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.estimator import regression
import tflearn.datasets.mnist as mnist
import numpy
from scipy import misc
import csv

nrOfFiles = 0
csv_list = []

with open('/Users/gustavoskarsson/Desktop/car/csvfile.csv', 'r') as f:
    reader = csv.reader(f)
    csv_list = list(reader)

nrOfFiles = len(csv_list)

pics = []
face = misc.face()
for i in range(0, nrOfFiles):
    face = misc.imread('/Users/gustavoskarsson/Desktop/car/pics/' + str(i) + '.jpg')
    face = misc.imresize(face[:,:,0], (28, 28))
    pics.append(face)

X = numpy.array(pics)


steer = []
throt = []
for i in range(0, nrOfFiles):
    steer.append(csv_list[i][1])
    throt.append(csv_list[i][2])

#y__ = numpy.array([steer, throt])
Y = numpy.array(steer)
Y = Y.reshape(-1, 1)
# Ignore the throttle to begin with.


convnet = input_data(shape=[None, 28, 28, 1], name='input')

convnet = conv_2d(convnet, 32, 2, activation='relu')
convnet = max_pool_2d(convnet, 2)

convnet = conv_2d(convnet, 64, 2, activation='relu')
convnet = max_pool_2d(convnet, 2)

convnet = fully_connected(convnet, 1024, activation='relu')
convnet = dropout(convnet, 0.8)

convnet = fully_connected(convnet, 1, activation='softmax')
convnet = regression(convnet, optimizer='adam', learning_rate=0.01, loss='mean_square', name='targets')

model = tflearn.DNN(convnet)
model.fit(X, Y, n_epoch=6, batch_size=10, show_metric=True)
model.save('mod.model')

Predict.py

import tflearn
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.estimator import regression
import tflearn.datasets.mnist as mnist
import numpy
from scipy import misc


convnet = input_data(shape=[None, 28, 28, 1], name='input')
                           #[none, 28,  28, 1]

convnet = conv_2d(convnet, 32, 2, activation='relu')
convnet = max_pool_2d(convnet, 2)

convnet = conv_2d(convnet, 64, 2, activation='relu')
convnet = max_pool_2d(convnet, 2)

convnet = fully_connected(convnet, 1024, activation='relu')
convnet = dropout(convnet, 0.8)

convnet = fully_connected(convnet, 1, activation='softmax')
convnet = regression(convnet, optimizer='adam', learning_rate=0.01, loss='mean_square', name='targets')

model = tflearn.DNN(convnet)
model.load('mod.model')

#load test image
face = misc.face()
pics = []
for i in range(0, 3):
    face = misc.imread('/Users/gustavoskarsson/Desktop/car/pics/' + str(i) + '.jpg')
    face = misc.imresize(face[:,:,0], (28, 28))
    pics.append(face) 

test_x = numpy.array(pics)
test_x = test_x.reshape([-1, 28, 28, 1])
print(model.predict([test_x[0]]))

The problem is probably in your output layer. It uses a softmax activation, which always produces outputs in the range 0-1.

If you look at the softmax function definition, you will see that it depends on every output node of your layer. Since you have only one output node, it should always return 1, because you are dividing the node's output by its own value. If you want to learn more about softmax layers, check out Michael Nielsen's great free book on Neural Networks.
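To see this concretely, here is a quick standalone check (my own sketch using plain numpy, not part of the original post): softmax over a single-element vector always normalizes to 1, whatever the raw value is.

    import numpy as np

    def softmax(z):
        # exponentiate and normalize over all output nodes
        e = np.exp(z - np.max(z))
        return e / e.sum()

    print(softmax(np.array([3.7])))    # -> [1.]
    print(softmax(np.array([-42.0])))  # -> [1.]  a single node always normalizes to 1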

Besides that, a softmax function is not a good choice when you are not trying to classify things.

Try simply omitting activation='softmax' in the last fully connected layer.
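A minimal sketch of what the regression head could look like instead (assuming the rest of your network stays as it is; 'linear' is tflearn's default identity activation, so you could also just drop the argument):

    # one linear output node for the steering angle instead of a softmax
    convnet = fully_connected(convnet, 1, activation='linear')
    convnet = regression(convnet, optimizer='adam', learning_rate=0.01,
                         loss='mean_square', name='targets')

With a linear output the network can actually produce values outside 0-1, which is what you need for angles between 0 and 180.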

You are overwriting the convnet variable with the convolutional network at every layer. You should also down-sample at every layer. Your code should look something like this:

    x = tf.reshape(x, shape=[-1, 28, 28, 1])

    # Convolution Layer with 32 filters and a kernel size of 5
    conv1 = tf.layers.conv2d(x, 32, 5, activation=tf.nn.relu)

    # Max Pooling (down-sampling) with strides of 2 and kernel size of 2
    conv1 = tf.layers.max_pooling2d(conv1, 2, 2)

    # Convolution Layer with 64 filters and a kernel size of 3
    conv2 = tf.layers.conv2d(conv1, 64, 3, activation=tf.nn.relu)

    # Max Pooling (down-sampling) with strides of 2 and kernel size of 2
    conv2 = tf.layers.max_pooling2d(conv2, 2, 2)

You can also have a look here.
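If it helps, one possible way to finish that fragment off for a single-value regression output (my own sketch, not from the linked example, assuming TensorFlow 1.x where tf.layers.flatten and tf.layers.dense are available):

    # Flatten the last feature map and map it to one linear output node
    flat = tf.layers.flatten(conv2)
    fc1 = tf.layers.dense(flat, 1024, activation=tf.nn.relu)
    out = tf.layers.dense(fc1, 1)   # linear activation for the steering angle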