TensorFlow U-Net segmentation mask input
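I am new to TensorFlow and semantic segmentation.
I am designing a U-Net for semantic segmentation. Each image has one object that I want to classify, but in total I have images of 10 different objects. I am confused about how to prepare my mask input. Is this considered multi-label segmentation, or segmentation for only one class?
Should I convert my input to one-hot encoding? Should I use to_categorical? I have found examples of multi-class segmentation, but I don't know whether that applies here, because each image contains only one object to detect/classify.
I tried using the following as my input code, but I am not sure whether I am doing it right.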
import os
import cv2
import numpy as np
import tensorflow as tf
from tensorflow import keras

# Generation of batches of image and mask
class DataGen(keras.utils.Sequence):
    def __init__(self, image_names, path, batch_size, image_size=128):
        self.image_names = image_names
        self.path = path
        self.batch_size = batch_size
        self.image_size = image_size

    def __load__(self, image_name):
        # Path
        image_path = os.path.join(self.path, "images/aug_test", image_name) + ".png"
        mask_path = os.path.join(self.path, "masks/aug_test", image_name) + ".png"
        # Reading Image
        image = cv2.imread(image_path, 1)
        image = cv2.resize(image, (self.image_size, self.image_size))
        # Reading Mask
        mask = cv2.imread(mask_path, -1)
        mask = cv2.resize(mask, (self.image_size, self.image_size))
        # Normalizing
        image = image/255.0
        mask = mask/255.0
        return image, mask

    def __getitem__(self, index):
        if (index+1)*self.batch_size > len(self.image_names):
            self.batch_size = len(self.image_names) - index*self.batch_size
        image_batch = self.image_names[index*self.batch_size : (index+1)*self.batch_size]
        image = []
        mask = []
        for image_name in image_batch:
            _img, _mask = self.__load__(image_name)
            image.append(_img)
            mask.append(_mask)
        # This is where I am defining my input
        image = np.array(image)
        mask = np.array(mask)
        mask = tf.keras.utils.to_categorical(mask, num_classes=10, dtype='float32')  # Is this correct?
        return image, mask

    def __len__(self):
        return int(np.ceil(len(self.image_names)/float(self.batch_size)))
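Is this correct? If so, what should I change in my input to get the label/class as output? Should I change the pixel values of my mask according to my class?
Here is my U-Net architecture.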
# Convolution and deconvolution Blocks
def down_scaling_block(x, filters, kernel_size=(3, 3), padding="same", strides=1):
    conv = keras.layers.Conv2D(filters, kernel_size, padding=padding, strides=strides, activation="relu")(x)
    conv = keras.layers.Conv2D(filters, kernel_size, padding=padding, strides=strides, activation="relu")(conv)
    pool = keras.layers.MaxPool2D((2, 2), (2, 2))(conv)
    return conv, pool

def up_scaling_block(x, skip, filters, kernel_size=(3, 3), padding="same", strides=1):
    conv_t = keras.layers.UpSampling2D((2, 2))(x)
    concat = keras.layers.Concatenate()([conv_t, skip])
    conv = keras.layers.Conv2D(filters, kernel_size, padding=padding, strides=strides, activation="relu")(concat)
    conv = keras.layers.Conv2D(filters, kernel_size, padding=padding, strides=strides, activation="relu")(conv)
    return conv

def bottleneck(x, filters, kernel_size=(3, 3), padding="same", strides=1):
    conv = keras.layers.Conv2D(filters, kernel_size, padding=padding, strides=strides, activation="relu")(x)
    conv = keras.layers.Conv2D(filters, kernel_size, padding=padding, strides=strides, activation="relu")(conv)
    return conv

def UNet():
    filters = [16, 32, 64, 128, 256]
    inputs = keras.layers.Input((image_size, image_size, 3))
    '''inputs2 = keras.layers.Input((image_size, image_size, 1))
    conv1_2, pool1_2 = down_scaling_block(inputs2, filters[0])'''
    Input = inputs
    conv1, pool1 = down_scaling_block(Input, filters[0])
    conv2, pool2 = down_scaling_block(pool1, filters[1])
    conv3, pool3 = down_scaling_block(pool2, filters[2])
    '''conv3 = keras.layers.Conv2D(filters[2], kernel_size=(3,3), padding="same", strides=1, activation="relu")(pool2)
    conv3 = keras.layers.Conv2D(filters[2], kernel_size=(3,3), padding="same", strides=1, activation="relu")(conv3)
    drop3 = keras.layers.Dropout(0.5)(conv3)
    pool3 = keras.layers.MaxPooling2D((2,2), (2,2))(drop3)'''
    conv4, pool4 = down_scaling_block(pool3, filters[3])
    bn = bottleneck(pool4, filters[4])
    deConv1 = up_scaling_block(bn, conv4, filters[3])       # 8 -> 16
    deConv2 = up_scaling_block(deConv1, conv3, filters[2])  # 16 -> 32
    deConv3 = up_scaling_block(deConv2, conv2, filters[1])  # 32 -> 64
    deConv4 = up_scaling_block(deConv3, conv1, filters[0])  # 64 -> 128
    outputs = keras.layers.Conv2D(10, (1, 1), padding="same", activation="softmax")(deConv4)
    model = keras.models.Model(inputs, outputs)
    return model

model = UNet()
model.compile(optimizer='adam', loss="categorical_crossentropy", metrics=["acc"])
train_gen = DataGen(train_img, train_path, image_size=image_size, batch_size=batch_size)
valid_gen = DataGen(valid_img, train_path, image_size=image_size, batch_size=batch_size)
test_gen = DataGen(test_img, test_path, image_size=image_size, batch_size=batch_size)
train_steps = len(train_img)//batch_size
valid_steps = len(valid_img)//batch_size
model.fit_generator(train_gen, validation_data=valid_gen, steps_per_epoch=train_steps, validation_steps=valid_steps,
                    epochs=epochs)
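I hope I explained my problem properly. Any help is appreciated!
UPDATE: I changed the value of each pixel in the mask according to the object class. (If the image contains the object that I want to classify as object no. 2, I set the mask pixels to 2. The whole mask array then contains 0 (background) and 2 (object). So for each object the mask contains 0 and 3, 0 and 10, and so on.)
Here I first converted the mask to binary, and then wherever the pixel value was greater than 1 I changed it to 1, 2 or 3, according to the object/class number.
Then I converted the masks to one-hot with to_categorical, as shown in my code. Training runs, but the network does not learn anything: accuracy and loss keep swinging between two values. What is my mistake? Am I making an error in generating the mask (changing the pixel values) or in the call to to_categorical?
Problem found:
I was making an error while creating the mask. I read the image with cv2, which returns it as height x width. I was creating the mask with class-based pixel values while treating the image dimensions as width x height. That caused the problem and prevented the network from learning anything. It is working now.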
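The height/width mix-up described above can be reproduced with NumPy alone. This is a minimal sketch, not the asker's actual code: the image size, the object region and `class_id` are made-up values, and a zero array stands in for what `cv2.imread` would return.

```python
import numpy as np

# Hypothetical non-square image: 4 rows (height) by 6 columns (width).
height, width = 4, 6
image = np.zeros((height, width, 3), dtype=np.uint8)  # stand-in for cv2.imread output

# Image arrays are indexed [row, column], so the shape is (height, width, channels).
h, w = image.shape[:2]

class_id = 2  # hypothetical class index of the object in this image
mask = np.zeros((h, w), dtype=np.uint8)  # same layout as the image
mask[1:3, 2:5] = class_id                # rows 1-2, columns 2-4 belong to the object

# The bug: allocating the mask as (width, height) no longer lines up with the image.
wrong_mask = np.zeros((w, h), dtype=np.uint8)

print(image.shape[:2], mask.shape, wrong_mask.shape)
```

With square images the swapped shape goes unnoticed, but the object region ends up transposed, so a quick shape and overlay check on a non-square sample is a cheap safeguard.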
Each image has one object that I want to classify. But in total I have images of 10 different objects. I am confused, how can I prepare my mask input? Is it considered as multi-label segmentation or only for one class?
If your dataset has N different labels (e.g. 0 - background, 1 - dog, 2 - cat, ...), you have a multi-class problem, even if each image contains only one kind of object.
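As a rough illustration of what such a label looks like for a single image, the sketch below turns a binary 0/255 foreground mask into a class-index mask. The helper name `to_class_mask`, the threshold and the sample values are assumptions made for this example. Note that if background is class 0 and the ten objects are numbered 1 to 10, there are 11 classes in total, so the one-hot depth and the final layer would need 11 channels rather than 10.

```python
import numpy as np

def to_class_mask(binary_mask, class_id, threshold=127):
    """Map a 0/255 foreground mask to a mask holding 0 (background) or class_id."""
    return np.where(binary_mask > threshold, class_id, 0).astype(np.uint8)

# Hypothetical 2x3 binary mask for an image that shows object number 2.
binary_mask = np.array([[0, 255, 255],
                        [0,   0, 255]], dtype=np.uint8)

class_mask = to_class_mask(binary_mask, class_id=2)
print(class_mask)
```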
Should I convert my input to one hot encoded? Should I use to_categorical?
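Yes, you should one-hot encode your labels. Whether to_categorical is usable comes down to the source format of your labels. Say you have N classes and your labels are (height, width, 1), where each pixel has a value in the range [0, N-1]. In that case keras.utils.to_categorical(label, N) gives a float (height, width, N) label in which every entry is either 0 or 1. And you do not divide by 255.
If your source format is different, you may have to use a custom function to get the same output format.
Check out this repository (not my work): keras-unet. The notebooks folder contains two examples that train a U-Net on small datasets. They are not multi-class, but it is easy to adapt them step by step to your own dataset. Load your labels like so: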
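To make the target format concrete, here is a NumPy-only sketch of the same encoding. It is a stand-in for to_categorical, not its implementation, and `one_hot_mask` is a name invented for this example. It also shows why dividing the mask by 255 is harmful: the scaled values truncate to 0 as soon as they are cast to integers, so every pixel becomes background.

```python
import numpy as np

def one_hot_mask(class_mask, num_classes):
    """One-hot encode an integer (H, W) mask into a float32 (H, W, num_classes) array."""
    class_mask = np.asarray(class_mask)
    if class_mask.min() < 0 or class_mask.max() >= num_classes:
        raise ValueError("mask values must lie in [0, num_classes - 1]")
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=np.float32)[class_mask]

class_mask = np.array([[0, 2, 2],
                       [0, 0, 2]])
encoded = one_hot_mask(class_mask, num_classes=3)
print(encoded.shape)  # (2, 3, 3)

# Dividing by 255 first destroys the labels once they are cast to integers.
scaled = (class_mask / 255.0).astype(np.int64)
print(scaled)  # all zeros
```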
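from PIL import Image
from keras.utils import to_categorical
im = Image.open(mask).resize((512, 512), Image.NEAREST)  # nearest-neighbour keeps label values intact
im = to_categorical(im, NCLASSES)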
Reshape and normalize like so:
x = np.asarray(imgs_np, dtype=np.float32)/255
y = np.asarray(masks_np, dtype=np.float32)
y = y.reshape(y.shape[0], y.shape[1], y.shape[2], NCLASSES)
x = x.reshape(x.shape[0], x.shape[1], x.shape[2], 3)
Adapt your model to NCLASSES:
from keras_unet.models import custom_unet
model = custom_unet(
    input_shape,
    use_batch_norm=False,
    num_classes=NCLASSES,
    filters=64,
    dropout=0.2,
    output_activation='softmax')
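The question also asks how to get a label/class back out of the network. With a softmax output of shape (batch, height, width, NCLASSES), one option is a per-pixel argmax. The sketch below uses a hand-written probability array in place of a real model.predict result, and the majority-vote step for a single image-level label is an assumption that relies on each image containing one object.

```python
import numpy as np

# Hypothetical softmax output for one 2x2 image and 3 classes: (batch, H, W, classes).
probs = np.array([[[[0.8, 0.1, 0.1], [0.2, 0.7, 0.1]],
                   [[0.1, 0.2, 0.7], [0.1, 0.2, 0.7]]]], dtype=np.float32)

# Per-pixel class index: the most probable class along the last axis.
pred_mask = np.argmax(probs, axis=-1)  # shape (1, 2, 2)

# One label per image: the most frequent non-background class in the predicted mask.
counts = np.bincount(pred_mask[0].ravel(), minlength=probs.shape[-1])
counts[0] = 0  # ignore background (class 0)
image_label = int(np.argmax(counts))

print(pred_mask[0])
print(image_label)
```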
Select the correct loss:
from keras.optimizers import SGD
from keras_unet.metrics import iou, iou_thresholded
model.compile(
    optimizer=SGD(learning_rate=0.01, momentum=0.99),  # `lr` in older Keras releases
    loss='categorical_crossentropy',
    metrics=[iou, iou_thresholded])
Hope it helps.