AVCaptureVideoPreviewLayer does not detect objects in two ranges of the screen

Question

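I downloaded Apple's sample project for recognizing objects in live capture. When I tried the app, I found that if I place the object to be recognized at the top or bottom of the camera view, the app does not recognize it: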

In the first image, the banana is in the center of the camera view and the app recognizes it.

[Image: object in center]

In these two images, the banana is near the top or bottom edge of the camera view and is not recognized.

[Image: object on top]

[Image: object on bottom]

The session and previewLayer are set up like this:

func setupAVCapture() {
    var deviceInput: AVCaptureDeviceInput!

    // Select a video device, make an input
    let videoDevice = AVCaptureDevice.DiscoverySession(deviceTypes: [.builtInWideAngleCamera], mediaType: .video, position: .back).devices.first
    do {
        deviceInput = try AVCaptureDeviceInput(device: videoDevice!)
    } catch {
        print("Could not create video device input: \(error)")
        return
    }

    session.beginConfiguration()
    session.sessionPreset = .vga640x480 // Model image size is smaller.

    // Add a video input
    guard session.canAddInput(deviceInput) else {
        print("Could not add video device input to the session")
        session.commitConfiguration()
        return
    }
    session.addInput(deviceInput)
    // Add a video data output
    if session.canAddOutput(videoDataOutput) {
        session.addOutput(videoDataOutput)
        videoDataOutput.alwaysDiscardsLateVideoFrames = true
        videoDataOutput.videoSettings = [kCVPixelBufferPixelFormatTypeKey as String: Int(kCVPixelFormatType_420YpCbCr8BiPlanarFullRange)]
        videoDataOutput.setSampleBufferDelegate(self, queue: videoDataOutputQueue)
    } else {
        print("Could not add video data output to the session")
        session.commitConfiguration()
        return
    }
    let captureConnection = videoDataOutput.connection(with: .video)
    // Always process the frames
    captureConnection?.isEnabled = true
    do {
        try  videoDevice!.lockForConfiguration()
        let dimensions = CMVideoFormatDescriptionGetDimensions((videoDevice?.activeFormat.formatDescription)!)
        bufferSize.width = CGFloat(dimensions.width)
        bufferSize.height = CGFloat(dimensions.height)
        videoDevice!.unlockForConfiguration()
    } catch {
        print(error)
    }
    session.commitConfiguration()
    previewLayer = AVCaptureVideoPreviewLayer(session: session)
    previewLayer.videoGravity = AVLayerVideoGravity.resizeAspectFill
    rootLayer = previewView.layer
    previewLayer.frame = rootLayer.bounds
    rootLayer.addSublayer(previewLayer)
}

You can download the project here. I would like to know whether this behavior is normal.

Is there a way to fix this? Does Core ML only process a square portion of the frame, so that these two ranges are left out? Any hints? Thanks.

Answer 1

This is probably because imageCropAndScaleOption is set to centerCrop.

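The Core ML model expects a square image, but the video frames are not square. With centerCrop, only the central square of each frame reaches the model, so the two ends of the longer side are never seen. You can fix this by setting imageCropAndScaleOption on the VNCoreMLRequest to a different value, such as .scaleFill or .scaleFit. However, the results may then be worse than with center crop, depending on how the model was originally trained.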
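To get a feel for how much of the frame is lost, here is a rough calculation. It assumes the center crop keeps the largest centered square of the frame and that the buffer is 640x480, as suggested by the .vga640x480 preset in the question. The function name centerCropMargin is made up for this sketch.

```swift
/// Fraction of the frame's longer side that is discarded at EACH end
/// when the frame is center-cropped to a square.
/// Hypothetical helper, not part of Vision or AVFoundation.
func centerCropMargin(width: Double, height: Double) -> Double {
    let longSide = max(width, height)
    let shortSide = min(width, height)
    // The square keeps `shortSide` pixels of the long side, centered,
    // so the remainder is split evenly between the two ends.
    return (longSide - shortSide) / 2.0 / longSide
}

// 640x480 frame: (640 - 480) / 2 = 80 pixels lost per end, 80 / 640 = 0.125
let margin = centerCropMargin(width: 640, height: 480)
print(margin) // 0.125
```

Under these assumptions, roughly 12.5% of the frame at each end of the long side never reaches the model. In portrait orientation that corresponds to strips at the top and bottom of the view, which would match the screenshots in the question.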

See also VNImageCropAndScaleOption in the Apple documentation.
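As a minimal sketch of how the option could be changed, the request might be created like this. The names makeDetectionRequest and handleResults are hypothetical and not taken from Apple's sample. Whether .scaleFill or .scaleFit works better depends on the model, so treat the choice below as something to experiment with.

```swift
import CoreML
import Vision

/// Builds a Vision request that passes the whole frame to the model
/// instead of only its central square.
/// `makeDetectionRequest` and `handleResults` are hypothetical names.
func makeDetectionRequest(
    for model: MLModel,
    handleResults: @escaping ([VNRecognizedObjectObservation]) -> Void
) throws -> VNCoreMLRequest {
    let visionModel = try VNCoreMLModel(for: model)
    let request = VNCoreMLRequest(model: visionModel) { request, _ in
        // Object detection models report VNRecognizedObjectObservation results.
        let observations = (request.results as? [VNRecognizedObjectObservation]) ?? []
        handleResults(observations)
    }
    // .scaleFill stretches the full frame to the model's input size, so the
    // edges are included but the aspect ratio is distorted.
    // .scaleFit keeps the aspect ratio and fits the whole frame inside the input.
    request.imageCropAndScaleOption = .scaleFill
    return request
}
```

Note that the bounding boxes returned by the model are relative to the image the model actually saw, so any code that maps them onto the preview layer may need adjusting after the option is changed.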

xcode

avcapturesession

swift

apple-vision

coreml