Bounding Box from VNDetectRectangleRequest is not correct size when used as child VC

Question

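I am trying to use VNDetectRectangleRequest from Apple's Vision framework to automatically capture a picture of a card. However, when I convert the points to draw the rectangle, it comes out misshapen and does not follow the card's outline as it should. I have been following this article closely.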

One major difference is that I am embedding my CameraCaptureVC in another ViewController, so that the card is scanned only when it is inside this smaller window.

Below is how I set up the camera VC in the parent VC (called from viewDidLoad).

func configureSubviews() {
    clearView.addSubview(cameraVC.view)
    cameraVC.view.autoPinEdgesToSuperviewEdges()
    self.addChild(cameraVC)
    cameraVC.didMove(toParent: self)
}

Below is the code that draws the rectangle.

func createLayer(in rect: CGRect) {
    let maskLayer = CAShapeLayer()
    maskLayer.frame = rect
    maskLayer.cornerRadius = 10
    maskLayer.opacity = 0.75
    maskLayer.borderColor = UIColor.red.cgColor
    maskLayer.borderWidth = 5.0

    previewLayer.insertSublayer(maskLayer, at: 1)
}

func removeMask() {
    if let sublayer = previewLayer.sublayers?.first(where: { $0 as? CAShapeLayer != nil }) {
        sublayer.removeFromSuperlayer()
    }
}

func drawBoundingBox(rect : VNRectangleObservation) {
    let transform = CGAffineTransform(scaleX: 1, y: -1).translatedBy(x: 0, y: -finalFrame.height)

    let scale = CGAffineTransform.identity.scaledBy(x: finalFrame.width, y: finalFrame.height)

    let bounds = rect.boundingBox.applying(scale).applying(transform)

    createLayer(in: bounds)
}

func detectRectangle(in image: CVPixelBuffer) {
    let request = VNDetectRectanglesRequest { (request: VNRequest, error: Error?) in
        DispatchQueue.main.async {
            guard let results = request.results as? [VNRectangleObservation],
                let rect = results.first else { return }
            self.removeMask()
            self.drawBoundingBox(rect: rect)
        }
    }
    request.minimumAspectRatio = 0.0
    request.maximumAspectRatio = 1.0
    request.maximumObservations = 0
    let imageRequestHandler = VNImageRequestHandler(cvPixelBuffer: image, options: [:])
    try? imageRequestHandler.perform([request])
}

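Here is my result. The red rectangle should follow the borders of the card, but it is much too short and its origin is not even at the top of the card.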

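I have tried changing the values in the drawBoundingBox function, but nothing seems to help. I have also tried converting the bounds in a different way, as shown below, but the result is the same and tweaking these values gets hacky.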

    let scaledHeight: CGFloat = originalFrame.width / finalFrame.width * finalFrame.height
    let boundingBox = rect.boundingBox
    let x = finalFrame.width * boundingBox.origin.x
    let height = scaledHeight * boundingBox.height
    let y = scaledHeight * (1 - boundingBox.origin.y) - height
    let width = finalFrame.width * boundingBox.width

    let bounds = CGRect(x: x, y: y, width: width, height: height)
    createLayer(in: bounds)

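Any help would be much appreciated. Perhaps, since I am embedding it as a child VC, I need to transform the coordinates a second time? I tried something like that to no avail, but maybe I did it wrong or missed something.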

Answer 1

First, let's look at boundingBox, which is a "normalized" rectangle. Apple says:

The coordinates are normalized to the dimensions of the processed image, with the origin at the image's lower-left corner.

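This means that:

The origin is at the bottom-left, not the top-left
The origin.x and width are fractions of the entire image's width
The origin.y and height are fractions of the entire image's height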
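
As a rough illustration of the three points above, here is a minimal sketch that turns a normalized, lower-left-origin rectangle into a top-left-origin rectangle in image units. The function name topLeftRect(fromNormalized:imageSize:) and the sample numbers are made up for this example; it uses plain arithmetic rather than any Vision helper and ignores how the image is displayed on screen.

```swift
import Foundation

/// Converts a Vision-style normalized rect (origin at the lower-left,
/// values between 0 and 1) into a top-left-origin rect in the units of `imageSize`.
/// Hypothetical helper for illustration only.
func topLeftRect(fromNormalized boundingBox: CGRect, imageSize: CGSize) -> CGRect {
    // Scale the fractional width and height up to the image's dimensions.
    let width = boundingBox.size.width * imageSize.width
    let height = boundingBox.size.height * imageSize.height
    let x = boundingBox.origin.x * imageSize.width
    // Flip the y-axis: the distance from the top is whatever remains above the box.
    let y = (1 - boundingBox.origin.y - boundingBox.size.height) * imageSize.height
    return CGRect(x: x, y: y, width: width, height: height)
}

// A box starting a quarter of the way in, halfway up, half as wide and a quarter as tall.
let sampleBox = CGRect(x: 0.25, y: 0.5, width: 0.5, height: 0.25)
let sampleRect = topLeftRect(fromNormalized: sampleBox, imageSize: CGSize(width: 400, height: 200))
// sampleRect has origin (100, 50) and size 200 x 50.
```

Note that the y value is measured from the top of the image after the flip, which is what UIKit views and layers expect.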

Hopefully this diagram makes it clearer:

(Figure: side-by-side comparison of "What you are used to" and "What Vision returns")

Your function above converts boundingBox to the coordinates of finalFrame, which I assume is the entire view's frame. That is much larger than your small CameraCaptureVC.

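Also, your CameraCaptureVC's preview layer probably has an aspectFill video gravity. You will also need to take into account the overflowing parts of the displayed image.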

Try this conversion function instead.

func getConvertedRect(boundingBox: CGRect, inImage imageSize: CGSize, containedIn containerSize: CGSize) -> CGRect {
    
    let rectOfImage: CGRect
    
    let imageAspect = imageSize.width / imageSize.height
    let containerAspect = containerSize.width / containerSize.height
    
    if imageAspect > containerAspect { /// image extends left and right
        let newImageWidth = containerSize.height * imageAspect /// the width of the overflowing image
        let newX = -(newImageWidth - containerSize.width) / 2
        rectOfImage = CGRect(x: newX, y: 0, width: newImageWidth, height: containerSize.height)
        
    } else { /// image extends top and bottom
        let newImageHeight = containerSize.width * (1 / imageAspect) /// the height of the overflowing image
        let newY = -(newImageHeight - containerSize.height) / 2
        rectOfImage = CGRect(x: 0, y: newY, width: containerSize.width, height: newImageHeight)
    }
    
    /// flip the y-axis so the origin is at the top-left
    let newOriginBoundingBox = CGRect(
        x: boundingBox.origin.x,
        y: 1 - boundingBox.origin.y - boundingBox.height,
        width: boundingBox.width,
        height: boundingBox.height
    )
    
    var convertedRect = VNImageRectForNormalizedRect(newOriginBoundingBox, Int(rectOfImage.width), Int(rectOfImage.height))
    
    /// add the margins
    convertedRect.origin.x += rectOfImage.origin.x
    convertedRect.origin.y += rectOfImage.origin.y
    
    return convertedRect
}

This takes into account the frame of the image view as well as the aspect fill content mode.
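
To sanity-check the conversion with concrete numbers, here is a dependency-free sketch of the same idea written in terms of a single aspect-fill scale factor. It is an equivalent formulation under my own assumptions, not the function above: the names aspectFillFrame and containerRect are hypothetical, it multiplies directly instead of calling VNImageRectForNormalizedRect, and it does not truncate sizes to Int, so results can differ by a fraction of a point.

```swift
import Foundation

/// Frame of an image displayed with aspect fill, in the container's
/// top-left coordinate space. Parts outside 0...container size are cropped.
func aspectFillFrame(imageSize: CGSize, containerSize: CGSize) -> CGRect {
    // Aspect fill uses the larger of the two scale factors so the image covers the container.
    let scale = max(containerSize.width / imageSize.width,
                    containerSize.height / imageSize.height)
    let displayedWidth = imageSize.width * scale
    let displayedHeight = imageSize.height * scale
    // Center the scaled image; a negative origin means it overflows on that axis.
    return CGRect(x: (containerSize.width - displayedWidth) / 2,
                  y: (containerSize.height - displayedHeight) / 2,
                  width: displayedWidth,
                  height: displayedHeight)
}

/// Maps a normalized, lower-left-origin bounding box into container coordinates.
func containerRect(fromNormalized boundingBox: CGRect, imageSize: CGSize, containerSize: CGSize) -> CGRect {
    let frame = aspectFillFrame(imageSize: imageSize, containerSize: containerSize)
    // Flip the y-axis so the box is measured from the top.
    let flippedY = 1 - boundingBox.origin.y - boundingBox.size.height
    return CGRect(x: frame.origin.x + boundingBox.origin.x * frame.size.width,
                  y: frame.origin.y + flippedY * frame.size.height,
                  width: boundingBox.size.width * frame.size.width,
                  height: boundingBox.size.height * frame.size.height)
}

// A 400 x 200 image in a 100 x 100 container is scaled by 0.5 to 200 x 100,
// so it overflows 50 points on the left and on the right.
let exampleRect = containerRect(fromNormalized: CGRect(x: 0.25, y: 0.5, width: 0.5, height: 0.25),
                                imageSize: CGSize(width: 400, height: 200),
                                containerSize: CGSize(width: 100, height: 100))
// exampleRect has origin (0, 25) and size 100 x 25.
```

In this example the box covers the middle half of the image horizontally, which is exactly the part that stays visible after the crop, so it spans the full width of the container.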

Example (for simplicity, I am using a static image instead of a live camera feed):

/// inside your Vision request completion handler...
guard let image = self.imageView.image else { return }

let convertedRect = self.getConvertedRect(
    boundingBox: observation.boundingBox,
    inImage: image.size,
    containedIn: self.imageView.bounds.size
)
self.drawBoundingBox(rect: convertedRect)

func drawBoundingBox(rect: CGRect) {
    let uiView = UIView(frame: rect)
    imageView.addSubview(uiView)
        
    uiView.backgroundColor = UIColor.clear
    uiView.layer.borderColor = UIColor.orange.cgColor
    uiView.layer.borderWidth = 3
}

I made an example project here.
