What does "keep_aspect_ratio_resizer {" do in the config file of the TensorFlow Object Detection API?

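I'm using the TensorFlow Object Detection API to train a Faster R-CNN model. GitHub: tensorflow/models

What kind of resizing does "keep_aspect_ratio_resizer {" in the config file perform?

I prepared images of 1920 x 1080 pixels and set both "min_dimension:" and "max_dimension:", which follow "keep_aspect_ratio_resizer {" in the config file, to 768.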

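In this case, I assume the 1920x1080 image is resized to 768x768 pixels before being fed to the CNN. When that happens, is the original aspect ratio (16:9) preserved? That is, is the long side of the image scaled to 768 pixels, with black bars added at the edges to fill out 768x768?

Or is the aspect ratio changed from 16:9 to 1:1, distorting the image?

If anyone knows, please let me know.

Thanks!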

The fields of the config file are defined in the protos directory: https://github.com/tensorflow/models/tree/master/research/object_detection/protos

The keep_aspect_ratio_resizer field is defined in image_resizer.proto, which states the following:

// Configuration proto for image resizer that keeps aspect ratio.
message KeepAspectRatioResizer {
  // Desired size of the smaller image dimension in pixels.
  optional int32 min_dimension = 1 [default = 600];

  // Desired size of the larger image dimension in pixels.
  optional int32 max_dimension = 2 [default = 1024];

  // Desired method when resizing image.
  optional ResizeType resize_method = 3 [default = BILINEAR];

  // Whether to pad the image with zeros so the output spatial size is
  // [max_dimension, max_dimension]. Note that the zeros are padded to the
  // bottom and the right of the resized image.
  optional bool pad_to_max_dimension = 4 [default = false];

  // Whether to also resize the image channels from 3 to 1 (RGB to grayscale).
  optional bool convert_to_grayscale = 5 [default = false];

  // Per-channel pad value. This is only used when pad_to_max_dimension is True.
  // If unspecified, a default pad value of 0 is applied to all channels.
  repeated float per_channel_pad_value = 6;
}

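So padding (the black bars) is opt-in: add pad_to_max_dimension: true to the config file, and the resized image is padded with zeros at the bottom and right until it is [max_dimension, max_dimension]. Without that option no padding is added. Either way, the resize itself preserves the aspect ratio, so the image is not distorted.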
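To make the 1920x1080 case concrete, here is a minimal Python sketch of the size arithmetic as I understand it from the proto comments: scale the shorter side to min_dimension, unless that pushes the longer side past max_dimension, in which case scale the longer side to max_dimension instead. The function names resized_shape and output_shape are made up for illustration and are not part of the API, and the rounding may differ slightly from the library's.

```python
def resized_shape(height, width, min_dimension=600, max_dimension=1024):
    """Return (new_height, new_width) after an aspect-ratio-preserving resize.

    Hypothetical helper, not part of the Object Detection API.
    """
    # First try: scale so the shorter side equals min_dimension.
    scale = min_dimension / min(height, width)
    # If the longer side would then exceed max_dimension,
    # scale so the longer side equals max_dimension instead.
    if round(max(height, width) * scale) > max_dimension:
        scale = max_dimension / max(height, width)
    return round(height * scale), round(width * scale)


def output_shape(height, width, min_dimension=600, max_dimension=1024,
                 pad_to_max_dimension=False):
    """Spatial size fed to the network, with or without zero padding."""
    new_h, new_w = resized_shape(height, width, min_dimension, max_dimension)
    if pad_to_max_dimension:
        # Zeros are added at the bottom and right up to a square.
        return max_dimension, max_dimension
    return new_h, new_w


# A 1920x1080 image (height 1080, width 1920) with both dimensions set to 768:
print(output_shape(1080, 1920, 768, 768))                             # (432, 768)
print(output_shape(1080, 1920, 768, 768, pad_to_max_dimension=True))  # (768, 768)
```

Under these assumptions, the image from the question becomes 768 wide by 432 high, still 16:9. With pad_to_max_dimension: true, the remaining 336 rows at the bottom are filled with zeros to reach 768x768.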