Crop boxes in TensorFlow Object Detection and display them as JPG images

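I'm using TensorFlow object detection to detect specific data on passports, such as the full name and other fields. I have trained the model and everything works: it identifies the data accurately and draws bounding boxes around it. However, now I only want to crop the detected boxes.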

Code:

import os
import cv2
import numpy as np
import tensorflow as tf
import sys

sys.path.append("..")

from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util

MODEL_NAME = 'inference_graph'

CWD_PATH = os.getcwd()

PATH_TO_CKPT = 'C:/Users/UI UX/Desktop/Captcha 3/CAPTCHA_frozen_inference_graph.pb'

PATH_TO_LABELS = 'C:/Users/UI UX/Desktop/Captcha 3/CAPTCHA_labelmap.pbtxt'

PATH_TO_IMAGE = 'C:/Users/UI UX/Desktop/(47).jpg'

NUM_CLASSES = 11

label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')

    sess = tf.Session(graph=detection_graph)

image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')

detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')

detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')

num_detections = detection_graph.get_tensor_by_name('num_detections:0')

image = cv2.imread(PATH_TO_IMAGE)

image_np = cv2.resize(image, (0, 0), fx=2.0, fy=2.0)

image_expanded = np.expand_dims(image_np, axis=0)

(boxes, scores, classes, num) = sess.run(
    [detection_boxes, detection_scores, detection_classes, num_detections],
    feed_dict={image_tensor: image_expanded})

vis_util.visualize_boxes_and_labels_on_image_array(
    image_np,
    np.squeeze(boxes),
    np.squeeze(classes).astype(np.int32),
    np.squeeze(scores),
    category_index,
    use_normalized_coordinates=True,
    line_thickness=2,
    min_score_thresh=0.60)

width, height = image_np.shape[:2]
for i, box in enumerate(np.squeeze(boxes)):
      if(np.squeeze(scores)[i] > 0.80):
        (ymin, xmin, ymax, xmax) = (box[0]*height, box[1]*width, box[2]*height, box[3]*width)
        cropped_image = tf.image.crop_to_bounding_box(image_np, ymin, xmin, ymax - ymin, xmax - xmin)
        cv2.imshow('cropped_image', image_np)
        cv2.waitKey(0)

cv2.imshow('Object detector', image_np)

cv2.waitKey(0)

cv2.destroyAllWindows()

But I get this error:

Traceback (most recent call last):
  File "C:/Users/UI UX/PycharmProjects/pythonProject1/vedio_object_detection.py", line 71, in <module>
    cropped_image = tf.image.crop_to_bounding_box(image_np, ymin, xmin, ymax - ymin, xmax - xmin)
  File "C:\ProgramData\Anaconda2\envs\tf_cpu\lib\site-packages\tensorflow_core\python\ops\image_ops_impl.py", line 875, in crop_to_bounding_box
    array_ops.stack([-1, target_height, target_width, -1]))
  File "C:\ProgramData\Anaconda2\envs\tf_cpu\lib\site-packages\tensorflow_core\python\ops\array_ops.py", line 855, in slice
    return gen_array_ops._slice(input_, begin, size, name=name)
  File "C:\ProgramData\Anaconda2\envs\tf_cpu\lib\site-packages\tensorflow_core\python\ops\gen_array_ops.py", line 9222, in _slice
    "Slice", input=input, begin=begin, size=size, name=name)
  File "C:\ProgramData\Anaconda2\envs\tf_cpu\lib\site-packages\tensorflow_core\python\framework\op_def_library.py", line 632, in _apply_op_helper
    param_name=input_name)
  File "C:\ProgramData\Anaconda2\envs\tf_cpu\lib\site-packages\tensorflow_core\python\framework\op_def_library.py", line 61, in _SatisfiesTypeConstraint
    ", ".join(dtypes.as_dtype(x).name for x in allowed_list)))
TypeError: Value passed to parameter 'begin' has DataType float32 not in list of allowed values: int32, int64

Any help?
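
The TypeError points at the cause: box[0]*height and the other products are float32 values, while crop_to_bounding_box only accepts integer offsets and sizes. Note also that image_np.shape[:2] is (height, width), so "width, height = image_np.shape[:2]" in the code above swaps the two. A minimal sketch of the conversion, assuming the boxes are normalized and ordered (ymin, xmin, ymax, xmax); to_pixel_box is a hypothetical helper name and the image size is made up for illustration:

```python
def to_pixel_box(box, image_height, image_width):
    """Convert a normalized (ymin, xmin, ymax, xmax) box to integer pixels."""
    ymin, xmin, ymax, xmax = box
    # y values scale with the height, x values with the width. The int() cast
    # matters because crop offsets and slice indices must be integers.
    return (int(ymin * image_height), int(xmin * image_width),
            int(ymax * image_height), int(xmax * image_width))


# shape[:2] of an OpenCV/NumPy image is (height, width), in that order.
image_shape = (400, 600, 3)  # hypothetical image: 400 pixels high, 600 wide
height, width = image_shape[:2]

ymin, xmin, ymax, xmax = to_pixel_box((0.25, 0.5, 0.75, 1.0), height, width)
# These four integers are what a crop call expects:
# offset_height, offset_width, target_height, target_width
print(ymin, xmin, ymax - ymin, xmax - xmin)  # 100 300 200 300
```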

I found a solution by adding the following code after this line:

(boxes, scores, classes, num) = sess.run([detection_boxes, detection_scores, detection_classes, num_detections],feed_dict={image_tensor: image_expanded})

I added:

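(frame_height, frame_width) = image.shape[:2]

for i in range(len(np.squeeze(scores))):
    # Skip low-confidence detections (0.80 is the threshold used in the question);
    # this also skips the zero-sized padding boxes, which would produce empty crops.
    if np.squeeze(scores)[i] < 0.80:
        continue
    # Boxes are normalized (ymin, xmin, ymax, xmax): scale to pixels and cast to int.
    ymin = int(np.squeeze(boxes)[i][0] * frame_height)
    xmin = int(np.squeeze(boxes)[i][1] * frame_width)
    ymax = int(np.squeeze(boxes)[i][2] * frame_height)
    xmax = int(np.squeeze(boxes)[i][3] * frame_width)
    # NumPy slicing is [row_start:row_end, col_start:col_end], so min comes before max.
    cropped_img = image[ymin:ymax, xmin:xmax]
    cv2.imwrite(f'/your/path/img_{i}.png', cropped_img)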
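
The same idea can be wrapped in a small function that is easy to try without a trained model. This is only a sketch under stated assumptions: crop_detections is a hypothetical helper name, the 0.80 default is the threshold from the question, and the image, boxes, and scores below are synthetic stand-ins for the detector output.

```python
import numpy as np


def crop_detections(image, boxes, scores, min_score=0.80):
    """Return one crop per box whose score is at least min_score.

    image  : H x W x C array, for example the result of cv2.imread
    boxes  : N x 4 array of normalized (ymin, xmin, ymax, xmax) values
    scores : length-N array of confidences
    """
    height, width = image.shape[:2]
    crops = []
    for box, score in zip(boxes, scores):
        if score < min_score:
            continue
        # Scale to pixels, cast to int, and clamp to the image bounds.
        ymin = max(0, int(box[0] * height))
        xmin = max(0, int(box[1] * width))
        ymax = min(height, int(box[2] * height))
        xmax = min(width, int(box[3] * width))
        if ymax > ymin and xmax > xmin:  # ignore empty boxes
            crops.append(image[ymin:ymax, xmin:xmax].copy())
    return crops


# Synthetic data, made up for this demo: one confident box and one padding box.
image = np.zeros((100, 200, 3), dtype=np.uint8)
boxes = np.array([[0.25, 0.25, 0.75, 0.5],
                  [0.0, 0.0, 0.0, 0.0]], dtype=np.float32)
scores = np.array([0.95, 0.10], dtype=np.float32)

crops = crop_detections(image, boxes, scores)
print(len(crops), crops[0].shape)  # 1 (50, 50, 3)
```

With the real detector output, np.squeeze(boxes) and np.squeeze(scores) would be passed in, and each crop could then be written with cv2.imwrite using a .jpg file name, since OpenCV picks the format from the extension.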