Coordinates of bounding box in TensorFlow
I want the coordinates of the predicted bounding boxes from a TensorFlow model.
I am using the object detection script from here.
After following some answers on Stack Overflow, I modified the last detection block to:
for image_path in TEST_IMAGE_PATHS:
    image = Image.open(image_path)
    # The array-based representation of the image will be used later in order
    # to prepare the result image with boxes and labels on it.
    image_np = load_image_into_numpy_array(image)
    # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
    image_np_expanded = np.expand_dims(image_np, axis=0)
    # Actual detection.
    output_dict = run_inference_for_single_image(image_np, detection_graph)
    # Visualization of the results of a detection.
    width, height = image.size
    print(width, height)
    ymin = output_dict['detection_boxes'][5][0] * height
    xmin = output_dict['detection_boxes'][5][1] * width
    ymax = output_dict['detection_boxes'][5][2] * height
    xmax = output_dict['detection_boxes'][5][3] * width
    # print(output_dict['detection_boxes'][0])
    print(xmin, ymin)
    print(xmax, ymax)
However, there are 100 tuples in output_dict['detection_boxes'] — there are 100 even for images where nothing was detected.
What I want are the coordinates of all the bounding boxes for a single image.
If you check the pipeline.config file of the model you are using, you will see that in some places the maximum number of boxes is set to 100.
For example, in the config file of ssd_mobilenet_v1, which is the model used in the demo notebook, you can see it under:
post_processing {
  batch_non_max_suppression {
    ...
    max_detections_per_class: 100
    max_total_detections: 100
  }
}
These are also the default values for the input readers (for both training and evaluation). You can change them, but that is only relevant when you are training/evaluating. If you want to run inference without retraining the model, you can simply take a pre-trained model (again, e.g. ssd_mobilenet_v1) and export it yourself, using the --config_override flag to change the NMS values I mentioned above.
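As a sketch, the export call might look like the following. The paths are placeholders, and the exact exporter script and flags depend on the version of the Object Detection API you are using:

```shell
# Re-export a pre-trained model with a lower NMS detection cap.
# PIPELINE_CONFIG, CHECKPOINT_PREFIX and OUTPUT_DIR are placeholders.
python object_detection/export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path "${PIPELINE_CONFIG}" \
    --trained_checkpoint_prefix "${CHECKPOINT_PREFIX}" \
    --output_directory "${OUTPUT_DIR}" \
    --config_override " \
        model { ssd { post_processing { batch_non_max_suppression { \
            max_detections_per_class: 10 \
            max_total_detections: 10 \
        } } } }"
```

The --config_override string is merged into the pipeline config at export time, so the frozen graph itself will emit at most 10 detections.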
After the expand_dims line, you can add this code. The filtered_boxes variable will give you the bounding boxes whose prediction score is greater than 0.5.
(boxes, scores, classes, num) = sess.run(
    [detection_boxes, detection_scores, detection_classes, num_detections],
    feed_dict={image_tensor: image_np_expanded})

# Collect the indices of detections with a valid COCO class ID (1-90)
# and a score above 0.5.
indexes = []
for i in range(classes.size):
    if classes[0][i] in range(1, 91) and scores[0][i] > 0.5:
        indexes.append(i)

filtered_boxes = boxes[0][indexes, ...]
filtered_scores = scores[0][indexes, ...]
filtered_classes = classes[0][indexes, ...]
filtered_classes = list(set(filtered_classes))
filtered_classes = [int(i) for i in filtered_classes]
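The index-collecting loop above can also be written as a single NumPy boolean mask. A minimal sketch with synthetic arrays shaped like the Object Detection API's outputs (boxes: [1, N, 4], scores: [1, N], classes: [1, N]):

```python
import numpy as np

# Synthetic detection outputs standing in for a real sess.run() result.
boxes = np.array([[[0.1, 0.1, 0.5, 0.5],
                   [0.2, 0.2, 0.6, 0.6],
                   [0.0, 0.0, 0.3, 0.3]]])
scores = np.array([[0.9, 0.4, 0.75]])
classes = np.array([[1.0, 17.0, 3.0]])

# Boolean mask replaces the loop: keep detections with a COCO class
# ID in 1-90 and a score above 0.5.
mask = (scores[0] > 0.5) & (classes[0] >= 1) & (classes[0] <= 90)
filtered_boxes = boxes[0][mask]
filtered_scores = scores[0][mask]
filtered_classes = classes[0][mask].astype(int)

print(filtered_boxes.shape)     # (2, 4)
print(list(filtered_classes))   # [1, 3]
```

This keeps the same filtering behavior while avoiding the Python-level loop over all 100 detections.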
for image_path in TEST_IMAGE_PATHS:
    image_np = cv2.imread(image_path)
    # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
    image_np_expanded = np.expand_dims(image_np, axis=0)
    # Actual detection.
    output_dict = run_inference_for_single_image(image_np, detection_graph)
    # Visualization of the results of a detection.
    vis_util.visualize_boxes_and_labels_on_image_array(
        image_np,
        output_dict['detection_boxes'],
        output_dict['detection_classes'],
        output_dict['detection_scores'],
        category_index,
        instance_masks=output_dict.get('detection_masks'),
        use_normalized_coordinates=True,
        line_thickness=8)
    # If using cv2 to load the image: shape is (height, width, channels).
    (im_height, im_width) = image_np.shape[:2]
    ymin = output_dict['detection_boxes'][0][0] * im_height
    xmin = output_dict['detection_boxes'][0][1] * im_width
    ymax = output_dict['detection_boxes'][0][2] * im_height
    xmax = output_dict['detection_boxes'][0][3] * im_width
Using the code above, you will get the bounding-box coordinates of the desired detected class, which sits at position 0 as indicated by the first square bracket.
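The unpacking pattern above can be factored into a small helper. This is a sketch (to_pixel_coords is a hypothetical name, not part of the Object Detection API) that converts one normalized [ymin, xmin, ymax, xmax] box into integer pixel coordinates:

```python
def to_pixel_coords(box, im_width, im_height):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to
    integer pixel coordinates (xmin, ymin, xmax, ymax)."""
    ymin, xmin, ymax, xmax = box
    return (int(xmin * im_width), int(ymin * im_height),
            int(xmax * im_width), int(ymax * im_height))

# Example: a normalized box on a 640x480 image.
print(to_pixel_coords([0.25, 0.5, 0.75, 1.0], 640, 480))  # (320, 120, 640, 360)
```

Applied to the loop above, each box in output_dict['detection_boxes'] could be passed through this helper instead of repeating the four multiplications inline.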