Why is the DeepLab v3+ model confused about pixels outside the image boundary?
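I am using the Google Research GitHub repository to run DeepLab v3+ on my dataset to segment the parts of a car. The crop size I use is 513,513 (the default), and the code adds a border to images smaller than that size (correct me if I am wrong).
The model seems to perform poorly on the added border. Is there something I am supposed to correct, or will the model do fine with more training?
Update: Here are the TensorBoard graphs for training. Why is the regularization loss like this? The output seems to be improving; can anyone help me draw inferences from these graphs?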
That is fine, don't mind the border.
To run inference, you can use this code:
import cv2
import tensorflow as tf
import numpy as np
from PIL import Image
from skimage.transform import resize


class DeepLabModel():
    """Class to load deeplab model and run inference."""

    INPUT_TENSOR_NAME = 'ImageTensor:0'
    OUTPUT_TENSOR_NAME = 'SemanticPredictions:0'
    INPUT_SIZE = 513

    def __init__(self, path):
        """Creates and loads pretrained deeplab model."""
        self.graph = tf.Graph()
        graph_def = None
        # Read the frozen graph from the exported .pb file (TF 1.x API).
        with tf.gfile.GFile(path, 'rb') as file_handle:
            graph_def = tf.GraphDef.FromString(file_handle.read())
        if graph_def is None:
            raise RuntimeError('Cannot find inference graph')
        with self.graph.as_default():
            tf.import_graph_def(graph_def, name='')
        self.sess = tf.Session(graph=self.graph)

    def run(self, image):
        """Runs inference on a single image.

        Args:
            image: A PIL.Image object, raw input image.

        Returns:
            seg_map: np.array. values of pixels are classes
        """
        width, height = image.size
        # Scale so that the longer side equals INPUT_SIZE, keeping the aspect ratio.
        resize_ratio = 1.0 * self.INPUT_SIZE / max(width, height)
        target_size = (int(resize_ratio * width), int(resize_ratio * height))
        # Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter.
        resized_image = image.convert('RGB').resize(target_size, Image.LANCZOS)
        batch_seg_map = self.sess.run(
            self.OUTPUT_TENSOR_NAME,
            feed_dict={self.INPUT_TENSOR_NAME: [np.asarray(resized_image)]})
        seg_map = batch_seg_map[0]
        # Nearest-neighbour resize (order=0) back to the original size keeps class ids intact.
        seg_map = resize(seg_map.astype(np.uint8), (height, width), preserve_range=True, order=0, anti_aliasing=False)
        return seg_map
The code is based on this file: https://github.com/tensorflow/models/blob/master/research/deeplab/deeplab_demo.ipynb
model = DeepLabModel(your_model_pb_path)
img = Image.open(img_path)
seg_map = model.run(img)
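The resize step in run() scales the image so that its longer side equals INPUT_SIZE while keeping the aspect ratio. A minimal sketch of just that arithmetic, using a helper name (compute_target_size) that is hypothetical and not part of the DeepLab repo:

```python
def compute_target_size(width, height, input_size=513):
    """Return (new_width, new_height) with the longer side scaled to input_size.

    Mirrors the arithmetic in DeepLabModel.run; the helper name is
    hypothetical and only used for illustration.
    """
    resize_ratio = 1.0 * input_size / max(width, height)
    # int() truncates, so the shorter side can come out slightly smaller
    # than exact proportional scaling would give.
    return (int(resize_ratio * width), int(resize_ratio * height))


if __name__ == "__main__":
    # A 1026x513 image is halved; the shorter side 256.5 is truncated to 256.
    print(compute_target_size(1026, 513))
```

Since the longer side never exceeds INPUT_SIZE, the resized image should fit within the 513x513 crop size used when exporting the model, which is presumably why the demo notebook resizes this way.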
To get your_model_pb_path, you need to export the model to a .pb file.
You can do that with the export_model.py file from the DeepLab repo:
https://github.com/tensorflow/models/blob/master/research/deeplab/export_model.py
If you are training the xception_65 version:
python3 <path to your deeplab folder>/export_model.py \
--logtostderr \
--checkpoint_path=<your ckpt> \
--export_path="./my_model.pb" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--num_classes=<NUMBER OF YOUR CLASSES> \
--crop_size=513 \
--crop_size=513 \
--inference_scales=1.0
<your ckpt>
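is the path to your trained model checkpoint. You can find the checkpoints in the folder you passed as the --train_logdir argument during training.
You only need to include the model name and the iteration number in the path. In other words, your training folder will contain files such as model-1000.meta, model-1000.index and model-1000.data-00000-of-00001. Discard everything from the "." onward, so the ckpt path will be model-1000.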
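The rule above (drop the file extension to get the checkpoint path) can be sketched in a few lines of Python. The function names below are hypothetical, and the sketch assumes the three file suffixes mentioned above (.meta, .index, .data-XXXXX-of-XXXXX) and an iteration number after the last "-". It strips the known suffixes rather than cutting at the first dot, so it should also cope with base names that contain a dot themselves.

```python
import re

# Suffixes TensorFlow appends to a checkpoint prefix (as listed in the answer).
_SUFFIX = re.compile(r"\.(meta|index|data-\d+-of-\d+)$")
# Iteration number at the end of the prefix, e.g. "model-1000" -> 1000.
_STEP = re.compile(r"-(\d+)$")


def checkpoint_prefix(filename):
    """Return the checkpoint path to pass as --checkpoint_path."""
    return _SUFFIX.sub("", filename)


def latest_checkpoint_prefix(filenames):
    """Pick the prefix with the highest iteration number, or None if there is none."""
    best_step, best_prefix = -1, None
    for name in filenames:
        if not name.endswith(".index"):
            continue  # consider each checkpoint once, via its .index file
        prefix = checkpoint_prefix(name)
        match = _STEP.search(prefix)
        if match and int(match.group(1)) > best_step:
            best_step, best_prefix = int(match.group(1)), prefix
    return best_prefix


if __name__ == "__main__":
    files = ["checkpoint", "model-500.index", "model-1000.index",
             "model-1000.meta", "model-1000.data-00000-of-00001"]
    print(latest_checkpoint_prefix(files))
```

TensorFlow also provides tf.train.latest_checkpoint(checkpoint_dir), which returns the most recent prefix recorded in the folder's checkpoint file, so the manual version is mainly useful when you want a specific iteration.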
Make sure atrous_rates matches the values you used to train the model.
If you are training the mobilenet_v2 version:
python3 <path to your deeplab folder>/export_model.py \
--logtostderr \
--checkpoint_path=<your ckpt> \
--export_path="./my_model.pb" \
--model_variant="mobilenet_v2" \
--num_classes=<NUMBER OF YOUR CLASSES> \
--crop_size=513 \
--crop_size=513 \
--inference_scales=1.0
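If you export models often, the two commands above can be assembled from one place. The following is only a sketch: build_export_args is a hypothetical helper, the flags are copied verbatim from the two commands above, and it assumes those are the only two variants you need.

```python
def build_export_args(checkpoint_path, num_classes, model_variant="xception_65",
                      export_path="./my_model.pb", crop_size=513,
                      atrous_rates=(6, 12, 18)):
    """Build the export_model.py argument list for one of the two variants above."""
    args = [
        "--logtostderr",
        "--checkpoint_path=%s" % checkpoint_path,
        "--export_path=%s" % export_path,
        "--model_variant=%s" % model_variant,
    ]
    if model_variant == "xception_65":
        # Must match the values used for training.
        args += ["--atrous_rates=%d" % rate for rate in atrous_rates]
        args += ["--output_stride=16", "--decoder_output_stride=4"]
    args += [
        "--num_classes=%d" % num_classes,
        # crop_size is passed twice: once for height, once for width.
        "--crop_size=%d" % crop_size,
        "--crop_size=%d" % crop_size,
        "--inference_scales=1.0",
    ]
    return args


if __name__ == "__main__":
    print(" ".join(build_export_args("model-1000", 5, "mobilenet_v2")))
```

The resulting list can be passed to subprocess.run(["python3", "<path to your deeplab folder>/export_model.py"] + args), which avoids shell line-continuation mistakes.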
You can find more here:
https://github.com/tensorflow/models/blob/master/research/deeplab/local_test_mobilenetv2.sh
https://github.com/tensorflow/models/blob/master/research/deeplab/local_test.sh
You can visualize the results with this code:
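img_arr = np.array(img.convert('RGB'))
# as many colors as you have classes (RGB order, since img comes from PIL)
colors = [(255, 0, 0), (0, 255, 0), ...]
for c in range(0, N_CLASSES):
    img_arr[seg_map == c] = 0.5 * img_arr[seg_map == c] + 0.5 * np.array(colors[c])
# cv2.imshow needs a window name and expects BGR channel order
cv2.imshow('segmentation', cv2.cvtColor(img_arr, cv2.COLOR_RGB2BGR))
cv2.waitKey(0)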
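The colors list needs one entry per class. If you do not want to pick them by hand, one option is to spread hues evenly around the color wheel. This is a sketch using only the standard library; make_colors is a hypothetical helper, not something from the DeepLab repo.

```python
import colorsys


def make_colors(n_classes):
    """Return n_classes RGB tuples (0-255) with evenly spaced hues."""
    colors = []
    for i in range(n_classes):
        # Full saturation and value; only the hue changes between classes.
        r, g, b = colorsys.hsv_to_rgb(i / float(n_classes), 1.0, 1.0)
        colors.append((int(round(r * 255)), int(round(g * 255)), int(round(b * 255))))
    return colors


if __name__ == "__main__":
    print(make_colors(3))
```

The result can be used directly as colors in the visualization loop, with N_CLASSES as the argument. With many classes, neighbouring hues become hard to tell apart, so a hand-picked palette may still read better.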