What's the function of "keep_aspect_ratio_resizer {" in the config file of the TensorFlow Object Detection API?

Question

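I am using the TensorFlow Object Detection API to build a Faster-RCNN detector. GitHub: Tensorflow/models

What kind of resizing does "keep_aspect_ratio_resizer {" in the config file perform?

I prepared images of 1920 x 1080 pixels and set both "min_dimension:" and "max_dimension:", which follow "keep_aspect_ratio_resizer {" in the config file, to 768.

In this case, I assume a 1920x1080 image is resized to 768x768 pixels and fed into the CNN. Is the original aspect ratio (16:9) preserved at that point? That is, when the image is resized to 768x768 pixels, is its long side scaled to 768 pixels, with black bars added at the edges of the image?

Or does the aspect ratio change from 16:9 to 1:1, so that the image is distorted?

If anyone knows about this, please let me know.

Thanks!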

Answer 1

The fields of the config file are defined in the protos at this link: https://github.com/tensorflow/models/tree/master/research/object_detection/protos

The keep_aspect_ratio_resizer field is defined in image_resizer.proto, which states the following:

// Configuration proto for image resizer that keeps aspect ratio.
message KeepAspectRatioResizer {
  // Desired size of the smaller image dimension in pixels.
  optional int32 min_dimension = 1 [default = 600];

  // Desired size of the larger image dimension in pixels.
  optional int32 max_dimension = 2 [default = 1024];

  // Desired method when resizing image.
  optional ResizeType resize_method = 3 [default = BILINEAR];

  // Whether to pad the image with zeros so the output spatial size is
  // [max_dimension, max_dimension]. Note that the zeros are padded to the
  // bottom and the right of the resized image.
  optional bool pad_to_max_dimension = 4 [default = false];

  // Whether to also resize the image channels from 3 to 1 (RGB to grayscale).
  optional bool convert_to_grayscale = 5 [default = false];

  // Per-channel pad value. This is only used when pad_to_max_dimension is True.
  // If unspecified, a default pad value of 0 is applied to all channels.
  repeated float per_channel_pad_value = 6;
}

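So you can opt into padding (black bars) by adding pad_to_max_dimension: true to the config file. As the proto comment says, the zeros are added at the bottom and the right of the resized image, so the output becomes [max_dimension, max_dimension]. Without that option the image is only resized. The aspect ratio is kept in both cases, so the image is not distorted to 1:1.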
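To make this concrete for the 1920x1080 case, here is a minimal sketch of the size arithmetic. It is not the API's own code: the function names keep_aspect_ratio_size and output_shape are hypothetical, and the logic assumes the resizer first tries to scale the smaller side to min_dimension and, if the larger side would then exceed max_dimension, scales the larger side to max_dimension instead.

```python
def keep_aspect_ratio_size(height, width, min_dimension, max_dimension):
    """Return (new_height, new_width) for an aspect-preserving resize.

    Hypothetical helper: it mirrors the assumed rule described above,
    not the actual implementation in the Object Detection API.
    """
    small, large = min(height, width), max(height, width)
    # First attempt: make the smaller side equal to min_dimension.
    scale = min_dimension / small
    # If the larger side would overflow max_dimension, fit the larger side instead.
    if round(large * scale) > max_dimension:
        scale = max_dimension / large
    return round(height * scale), round(width * scale)


def output_shape(height, width, min_dimension, max_dimension,
                 pad_to_max_dimension=False):
    """Return the spatial shape (height, width) of the tensor after resizing."""
    new_h, new_w = keep_aspect_ratio_size(height, width,
                                          min_dimension, max_dimension)
    if pad_to_max_dimension:
        # The resized image sits in the top-left corner; zeros fill the
        # bottom and right up to a max_dimension x max_dimension square.
        return max_dimension, max_dimension
    return new_h, new_w


if __name__ == "__main__":
    # Settings from the question: min_dimension = max_dimension = 768.
    print(output_shape(1080, 1920, 768, 768))        # (432, 768)
    print(output_shape(1080, 1920, 768, 768, True))  # (768, 768)
```

Under these assumptions the 1920x1080 image becomes 768x432 (still 16:9). With pad_to_max_dimension: true it is then padded with zeros to 768x768; without it the network receives the 768x432 image.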
