How can I take a photo of a detected rectangle in the Apple Vision framework?
How do I take a photo (i.e. get a CIImage) from a successful VNRectangleObservation?
I have a video capture session running, and I do the processing in func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection), i.e.:
func captureOutput(_ output: AVCaptureOutput,
                   didOutput sampleBuffer: CMSampleBuffer,
                   from connection: AVCaptureConnection) {
    // Pull the pixel buffer out of the sample buffer so Vision can analyze it
    guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
    do {
        try handler.perform([request], on: pixelBuffer)
    } catch {
        print(error)
    }
}
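For context, handler and request are assumed to be set up along these lines; the perform(_:on:) overload above matches VNSequenceRequestHandler, and the rest of this sketch is hypothetical:

import Vision

// Assumed setup (hypothetical): a sequence handler reused across frames,
// plus a rectangle-detection request whose completion handler receives
// the VNRectangleObservation results.
let handler = VNSequenceRequestHandler()
let request = VNDetectRectanglesRequest { request, error in
    guard let observations = request.results as? [VNRectangleObservation] else { return }
    // ... use the observations here ...
}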
Should I save the pixel buffer I pass to the handler somewhere and operate on that buffer? Unfortunately I can't access the image as a property of the observation object :(
Any ideas?
So you're using a Vision request that produces VNRectangleObservations, and you want to extract the image region each of those observations identifies? Perhaps also perspective-project it so that it's rectangular in the image plane? (There's a demo of this in the Vision session from WWDC17.)
You can use the CIPerspectiveCorrection filter from Core Image to extract and rectify that region. To set it up, you pass the points from the observation, converted into pixel coordinates. That looks something like this:
func extractPerspectiveRect(_ observation: VNRectangleObservation, from buffer: CVImageBuffer) -> CIImage {
    // get the pixel buffer into Core Image
    let ciImage = CIImage(cvImageBuffer: buffer)

    // convert corners from normalized image coordinates to pixel coordinates
    let topLeft = observation.topLeft.scaled(to: ciImage.extent.size)
    let topRight = observation.topRight.scaled(to: ciImage.extent.size)
    let bottomLeft = observation.bottomLeft.scaled(to: ciImage.extent.size)
    let bottomRight = observation.bottomRight.scaled(to: ciImage.extent.size)

    // pass those to the filter to extract/rectify the image
    return ciImage.applyingFilter("CIPerspectiveCorrection", parameters: [
        "inputTopLeft": CIVector(cgPoint: topLeft),
        "inputTopRight": CIVector(cgPoint: topRight),
        "inputBottomLeft": CIVector(cgPoint: bottomLeft),
        "inputBottomRight": CIVector(cgPoint: bottomRight),
    ])
}
Aside: The scaled function above is a convenience extension on CGPoint to make the coordinate math a bit smaller at the call site:
extension CGPoint {
    func scaled(to size: CGSize) -> CGPoint {
        return CGPoint(x: self.x * size.width,
                       y: self.y * size.height)
    }
}
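To tie this back to the capture callback in your question: yes, keep a reference to the pixel buffer you perform the request on, so the completion handler can crop from the same frame. A rough sketch under those assumptions (the property and function names here are hypothetical):

// Sketch, not a definitive pipeline: stash the most recent buffer in a
// property (hypothetical name) so the request's completion handler can
// read from the frame the request actually ran on.
var currentBuffer: CVImageBuffer?

func captureOutput(_ output: AVCaptureOutput,
                   didOutput sampleBuffer: CMSampleBuffer,
                   from connection: AVCaptureConnection) {
    guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
    currentBuffer = pixelBuffer
    try? handler.perform([request], on: pixelBuffer)
}

// Called from the request's completion handler, however it's installed:
func handleRectangles(request: VNRequest, error: Error?) {
    guard let buffer = currentBuffer,
          let observation = request.results?.first as? VNRectangleObservation
    else { return }
    let corrected = extractPerspectiveRect(observation, from: buffer)
    // ... hand `corrected` off for rendering/display (see below) ...
}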
Now, this gets you a CIImage object. Those aren't really displayable images themselves, just instructions for how to process and display an image, something that can be done in many different possible ways. Many of those ways involve a CIContext (you can have it render out into another pixel buffer, or maybe a Metal texture if you're trying to do this processing in real time), but not all. If you're just displaying static images less frequently, you can create a UIImage directly from the CIImage and display it in a UIImageView, and UIKit will manage the underlying CIContext and the rendering process.
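If it helps, here's a minimal sketch of both display routes described above (UIKit assumed; the function names are mine):

import UIKit

// Static-display path: wrap the CIImage directly and let UIKit render it.
func displayDirectly(_ corrected: CIImage, in imageView: UIImageView) {
    imageView.image = UIImage(ciImage: corrected)
}

// Explicit path: reuse one CIContext (creating them is expensive) and
// render the CIImage out to a CGImage yourself.
let context = CIContext()
func displayViaContext(_ corrected: CIImage, in imageView: UIImageView) {
    if let cgImage = context.createCGImage(corrected, from: corrected.extent) {
        imageView.image = UIImage(cgImage: cgImage)
    }
}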