Why must I reshape one image to [n, height, width, channel] in a CNN?

Question

I tried to apply a convolutional layer to an image of shape [256,256,3], but I get an error when I pass the image tensor directly:

conv1 = conv2d(input,W_conv1) +b_conv1  #<=== error

Error message:

ValueError: Shape must be rank 4 but is rank 3 for 'Conv2D' (op: 'Conv2D') 
with input shapes: [256,256,3], [3,3,3,1].

But when I reshape the input first, conv2d works normally:

x_image = tf.reshape(input,[-1,256,256,3])
conv1 = conv2d(x_image,W_conv1) +b_conv1

If I must reshape the tensor, what is the best shape to reshape to in my case, and why?

import tensorflow as tf
import numpy as np
from PIL import Image

def img_to_tensor(img) :
    return tf.convert_to_tensor(img, np.float32)

def weight_generater(shape):
    return tf.Variable(tf.truncated_normal(shape,stddev=0.1))

def bias_generater(shape):
    return tf.Variable(tf.constant(.1,shape=shape))

def conv2d(x,W):
    return tf.nn.conv2d(x,W,[1,1,1,1],'SAME')

def pool_max_2x2(x):
    return tf.nn.max_pool(x,ksize=[1,2,2,1],strides=[1,1,1,1],padding='SAME')

#read image
img = Image.open("img.tif")

sess = tf.InteractiveSession()

# convert image to tensor
input = img_to_tensor(img).eval()
#print(input)

# get img dimension
img_dimension = tf.shape(input).eval()
print(img_dimension)

height,width,channel=img_dimension
filter_size = 3
feature_map = 32

x = tf.placeholder(tf.float32,shape=[height*width*channel])
y = tf.placeholder(tf.float32,shape=21)

# generate weights [kernel size, kernel size, channel, number of filters]
W_conv1 = weight_generater([filter_size,filter_size,channel,1])

# each filter in W has its own bias
b_conv1 = bias_generater([feature_map])

""" I must reshape the picture
x_image = tf.reshape(input,[-1,256,256,3])
"""
conv1 = conv2d(input,W_conv1) +b_conv1  #<=== error

h_conv1 = tf.nn.relu(conv1)

h_pool1 = pool_max_2x2(h_conv1)

layer1_dimension = tf.shape(h_pool1).eval()

print(layer1_dimension)

Answer 1

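The first dimension is the batch size. If you feed one image at a time, you can simply set the first dimension to 1. This does not change your data; it only changes the indexing to 4D: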

x_image = tf.reshape(input, [1, 256, 256, 3])
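To see that adding the batch dimension leaves the pixels untouched, here is a minimal sketch in NumPy rather than TensorFlow (an assumption made so it runs without a session; for a fixed target shape, reshape behaves the same way here). The `image` array is a hypothetical stand-in for the decoded picture.

```python
import numpy as np

# Hypothetical stand-in for the decoded image: 256 x 256 pixels, 3 channels.
image = np.arange(256 * 256 * 3, dtype=np.float32).reshape(256, 256, 3)

# Add a leading batch dimension of size 1 (rank 3 -> rank 4).
batched = image.reshape(1, 256, 256, 3)

print(image.ndim, batched.ndim)            # 3 4
print(batched.shape)                       # (1, 256, 256, 3)

# The only entry in the batch is the original image, value for value.
print(np.array_equal(batched[0], image))   # True

# np.expand_dims is an equivalent way to add the batch axis.
same = np.expand_dims(image, axis=0)
print(same.shape == batched.shape)         # True
```

So the reshape only changes how the data is indexed: `batched[0, y, x, c]` is the same value as `image[y, x, c]`.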

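If you reshape with -1 in the first dimension, you are saying that you will feed in a 4D batch of images (shaped [batch_size, height, width, color_channels]) and that you allow the batch size to be dynamic, which is common. The -1 tells reshape to infer that dimension from the total number of elements, so a single 256x256x3 image ends up with a batch size of 1.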
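As a rough illustration of how -1 lets the batch size vary, here is a small NumPy sketch (NumPy is an assumption for runnability; `tf.reshape` infers a -1 dimension in the same way). The helper `to_batch` and the constants are hypothetical names used only for this example.

```python
import numpy as np

HEIGHT, WIDTH, CHANNELS = 256, 256, 3

def to_batch(pixels):
    """Reshape pixel data to [batch, height, width, channels].

    The -1 is inferred from the total number of elements, so the same
    call handles one image or many.
    """
    data = np.asarray(pixels, dtype=np.float32)
    return data.reshape(-1, HEIGHT, WIDTH, CHANNELS)

# One image, still rank 3.
one_image = np.zeros((HEIGHT, WIDTH, CHANNELS))
# Five images, each flattened to a single row.
five_images = np.zeros((5, HEIGHT * WIDTH * CHANNELS))

single_batch = to_batch(one_image)
multi_batch = to_batch(five_images)

print(single_batch.shape)  # (1, 256, 256, 3)
print(multi_batch.shape)   # (5, 256, 256, 3)
```

For a single image, [1, 256, 256, 3] and [-1, 256, 256, 3] therefore give the same result; -1 is simply more convenient when the number of images is not known in advance.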
