2D Convolutional neural networks with variable size images
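I have implemented a convolutional autoencoder with Keras, using the Theano backend. I am changing my approach to try to handle images of different sizes. As long as I build the dataset with numpy's stack function (images of equal size), I am golden. For images of different sizes, however, we cannot use stack, and fit expects a numpy array. So I switched to fit_generator to avoid the size checks. The problem is that the last layer expects 16 as the last dimension of the input, and I don't understand why it gets the dimensions of the original image.
Have a look at the code and error output below.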
import numpy as np
import keras
from keras.models import Sequential, Model
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
AE_EPOCHS = 10
VERB = 1
batchsz = 16
outfun = 'sigmoid'
data = []
dimensions = [(10, 15), (12, 15), (7,15), (20,15), (25,15)]
for d in dimensions:
    dd = np.random.rand(*d)
    dd = dd.reshape((1,)+dd.shape)
    data.append(dd)
input_img = Input(shape=(1, None, 15))
filtersz = 3
pad_it = 'same'
size1 = 16
size2 = 8
x = Conv2D(size1, (filtersz, filtersz), activation='relu', padding=pad_it)(input_img)
x = MaxPooling2D((2, 2), padding=pad_it)(x)
x = Conv2D(size2, (filtersz, filtersz), activation='relu', padding=pad_it)(x)
x = MaxPooling2D((2, 2), padding=pad_it)(x)
x = Conv2D(size2, (filtersz, filtersz), activation='relu', padding=pad_it)(x)
encoded = MaxPooling2D((2, 2), padding=pad_it)(x)
x = Conv2D(size2, (filtersz, filtersz), activation='relu', padding=pad_it)(encoded)
x = UpSampling2D((2, 2), data_format="channels_first")(x)
x = Conv2D(size2, (filtersz, filtersz), activation='relu', padding=pad_it)(x)
x = UpSampling2D((2, 2), data_format="channels_first")(x)
x = Conv2D(size1, (filtersz, filtersz), activation='relu', padding=pad_it)(x)
x = UpSampling2D((2, 2), data_format="channels_first")(x)
decoded = Conv2D(1, (filtersz, filtersz), activation=outfun, padding=pad_it)(x)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss= 'binary_crossentropy')
x_train = data[1:]
x_test= data[0].reshape((1,)+ data[0].shape)
def mygen(xx, *args, **kwargs):
    for i in xx:
        yield (i,i)
thegen = mygen(x_train)
#If I use this generator somehow None is returned so it is not used
thegenval = mygen(np.array([x_test]))
hist = autoencoder.fit_generator(thegen,
                                 epochs=AE_EPOCHS,
                                 steps_per_epoch=4,
                                 verbose=VERB,
                                 validation_data=(x_test, x_test),
                                 validation_steps=1
                                 )
Traceback (most recent call last):
  File "stacko.py", line 107, in <module>
    validation_steps=1
  File "/usr/local/lib/python3.5/dist-packages/keras/legacy/interfaces.py", line 88, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training.py", line 1847, in fit_generator
    val_x, val_y, val_sample_weight)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training.py", line 1315, in _standardize_user_data
    exception_prefix='target')
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training.py", line 139, in _standardize_input_data
    str(array.shape))
ValueError: Error when checking target: expected conv2d_7 to have shape (None, 1, None, 16) but got array with shape (1, 1, 10, 15)
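There are two problems with the code above. First, the size of each image axis must be a multiple of 8. That is the smallest number of filters per layer here, but it is also 2^3, the factor by which the three 2x2 pooling stages shrink the image before the decoder upsamples it again, so any other size does not come back unchanged. Second, the generator passed to fit_generator must return batches (4D numpy arrays).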
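To see where the 16 in the error message comes from, the spatial sizes can be traced by hand. This is a minimal sketch in plain Python, assuming that with 'same' padding each 2x2 MaxPooling2D acts as a ceiling division by 2 and each UpSampling2D doubles the axis. The helper name roundtrip_size is made up for illustration and is not part of Keras.

```python
import math

def roundtrip_size(n, stages=3, pool=2):
    """Axis length after `stages` poolings ('same' padding) and as many upsamplings."""
    for _ in range(stages):
        n = math.ceil(n / pool)   # pooling with padding='same' rounds up
    for _ in range(stages):
        n = n * pool              # upsampling repeats every row/column
    return n

for size in (15, 10, 7, 16, 24, 32):
    print(size, "->", roundtrip_size(size))
# 15 -> 16
# 10 -> 16
# 7 -> 8
# 16 -> 16
# 24 -> 24
# 32 -> 32
```

Only the multiples of 8 come back unchanged, so only for those sizes does the output of the autoencoder match the target image.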
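The generator is implemented with itertools.cycle and reshapes each image into a batch of one sample (if several images share the same dimensions, a batch of variable size could be built for each set of dimensions). A working example follows.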
import numpy as np
from itertools import cycle
import keras
from keras.models import Sequential, Model
from keras.layers import Input, Dense, Conv2D, MaxPooling2D, UpSampling2D
AE_EPOCHS = 10
VERB = 1
outfun = 'sigmoid'
data = []
dimensions = [(16, 32), (24, 32), (8,32), (32,32)]
for d in dimensions:
    dd = np.random.rand(*d)
    dd = dd.reshape((1,)+dd.shape)
    data.append(dd)
input_img = Input(shape=(1, None, 32))
filtersz = 3
pad_it = 'same'
size1 = 16
size2 = 8
x = Conv2D(size1, (filtersz, filtersz), activation='relu', padding=pad_it)(input_img)
x = MaxPooling2D((2, 2), padding=pad_it)(x)
x = Conv2D(size2, (filtersz, filtersz), activation='relu', padding=pad_it)(x)
x = MaxPooling2D((2, 2), padding=pad_it)(x)
x = Conv2D(size2, (filtersz, filtersz), activation='relu', padding=pad_it)(x)
encoded = MaxPooling2D((2, 2), padding=pad_it)(x)
x = Conv2D(size2, (filtersz, filtersz), activation='relu', padding=pad_it)(encoded)
x = UpSampling2D((2, 2), data_format="channels_first")(x)
x = Conv2D(size2, (filtersz, filtersz), activation='relu', padding=pad_it)(x)
x = UpSampling2D((2, 2), data_format="channels_first")(x)
x = Conv2D(size1, (filtersz, filtersz), activation='relu', padding=pad_it)(x)
x = UpSampling2D((2, 2), data_format="channels_first")(x)
decoded = Conv2D(1, (filtersz, filtersz), activation=outfun, padding=pad_it)(x)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss= 'binary_crossentropy')
x_train = data[1:]
x_test= [data[0]]
def mygen(xx, *args, **kwargs):
    for i in cycle(xx):
        ii = i.reshape((1,)+i.shape)
        yield ii,ii
thegen = mygen(x_train)
thegenval = mygen(x_test)
hist = autoencoder.fit_generator(
    thegen,
    epochs=AE_EPOCHS,
    steps_per_epoch=3,
    verbose=VERB,
    validation_data=thegenval,
    validation_steps=1
)
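The variable-size batches mentioned above (one batch per set of dimensions) could look roughly like the sketch below. It is not part of the original answer: the name batches_by_shape and the sample heights are assumptions for illustration, and it uses only numpy so it can be tried without Keras.

```python
from collections import defaultdict
from itertools import cycle

import numpy as np

def batches_by_shape(images):
    """Yield (x, x) batches forever, one batch per distinct image shape."""
    groups = defaultdict(list)
    for img in images:
        groups[img.shape].append(img)
    # np.stack works here because all images in a group share one shape
    batches = [np.stack(group) for group in groups.values()]
    for batch in cycle(batches):
        yield batch, batch

# Six channels-first images with three distinct heights (all multiples of 8)
images = [np.random.rand(1, h, 32) for h in (16, 24, 16, 8, 24, 16)]
gen = batches_by_shape(images)
first_three = [next(gen)[0].shape for _ in range(3)]
print(first_three)
# [(3, 1, 16, 32), (2, 1, 24, 32), (1, 1, 8, 32)]
```

With a generator like this, steps_per_epoch would presumably be the number of distinct shapes (3 here) rather than the number of images.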