How to track the barcode with highest confidence

I'm using the Vision framework to detect barcodes. I want to display a rectangle around the barcode with the highest confidence in live video — that is, I want the rectangle to track the barcode seen in the live preview.

So I have this code to detect barcodes inside a region of interest (ROI):

lazy var barcodeRequest: VNDetectBarcodesRequest = {
    let barcodeRequest = VNDetectBarcodesRequest {[weak self] request, error in
      guard error == nil else {
        print("ERROR: \(error?.localizedDescription ?? "error")")
        return
      }
      self?.resultClassification(request)
    }
    barcodeRequest.regionOfInterest = CGRect(x: 0,
                                             y: 0.3,
                                             width: 1,
                                             height: 0.4)
    return barcodeRequest
  }()

This method is triggered when a barcode is detected:

func resultClassification(_ request: VNRequest) {
    guard let barcodes = request.results,
          let potentialCodes = barcodes as? [VNBarcodeObservation]
    else { return }
    
    // choose the barcode with the highest confidence
    let highestConfidenceBarcodeDetected = potentialCodes.max(by: { $0.confidence < $1.confidence })
    
    // do something with highestConfidenceBarcodeDetected

    // 1
  }

Here's my problem.

Now that I have the barcode with the highest confidence, I want to track it on screen. So I suppose I have to add code at // 1.

But before that, I have to define this for the tracker:

var inputObservation: VNDetectedObjectObservation!


lazy var barcodeTrackingRequest: VNTrackObjectRequest = {
  let barcodeTrackingRequest = VNTrackObjectRequest(detectedObjectObservation: inputObservation) { [weak self] request, error in
    guard error == nil else {
      print("Detection error: \(String(describing: error)).")
      return
    }
    self?.resultClassificationTracker(request)
  }
  return barcodeTrackingRequest
}()

func resultClassificationTracker(_ request: VNRequest) {
  // all I want from this is to store the bounding box in a var
}

Now, how do I connect these two pieces of code so that resultClassificationTracker fires every time the tracker produces a bounding box?
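In rough Swift, this is what I imagine the wiring could look like (this is a sketch only — the `sequenceHandler` and `trackBarcode` names are mine, based on my understanding that a VNTrackObjectRequest has to be driven by one VNSequenceRequestHandler across frames, unlike the one-shot VNImageRequestHandler):

```swift
import Vision

/// Sketch: owned by the view controller; one sequence handler is reused
/// across frames so the tracker can carry state between them.
let sequenceHandler = VNSequenceRequestHandler()

func resultClassification(_ request: VNRequest) {
    guard let potentialCodes = request.results as? [VNBarcodeObservation] else { return }

    // choose the barcode with the highest confidence
    guard let best = potentialCodes.max(by: { $0.confidence < $1.confidence }) else { return }

    // 1: seed the tracker with the detected barcode's bounding box
    inputObservation = VNDetectedObjectObservation(boundingBox: best.boundingBox)
}

/// Called for every subsequent camera frame (e.g. from didOutput)
func trackBarcode(in pixelBuffer: CVPixelBuffer) {
    guard inputObservation != nil else { return } // nothing detected yet
    try? sequenceHandler.perform([barcodeTrackingRequest], on: pixelBuffer)
}
```

Is something like this the right shape, or is there a better way to hand the detection result to the tracking request?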

I did something similar a while ago and wrote an article on it. It was for VNRecognizeTextRequest rather than VNDetectBarcodesRequest, but it's similar. Here's what I did:

  • Perform the VNImageRequestHandler continuously (as soon as it finishes, start it again)
  • Store the detection-indicator view in a property, var previousTrackingView: UIView?
  • Whenever the request handler finishes, animate the detection indicator to the new rectangle
  • Use Core Motion to detect device movement and adjust the detection indicator's frame

Here's the result:

As you can see, the height/y coordinate isn't very accurate. My guess is that Vision only needs one horizontal line to scan a barcode — just like the laser scanners at grocery stores — so it doesn't return the full height. But that's a separate question.

Perform the VNImageRequestHandler continuously (as soon as it finishes, start it again)

For this, I made a property busyPerformingVisionRequest, and whenever it's false I kick off a Vision request. This happens inside the didOutput function, which is called whenever the camera frame changes.


class ViewController: UIViewController, AVCaptureVideoDataOutputSampleBufferDelegate {

    var busyPerformingVisionRequest = false

    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }

        if busyPerformingVisionRequest == false {
            lookForBarcodes(in: pixelBuffer) /// start the vision as many times as possible
        }
    }
}
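lookForBarcodes isn't shown above, but it presumably looks something like this sketch (the queue and orientation are assumptions on my part; the important detail is setting the flag to true before performing the request, so didOutput doesn't start overlapping requests):

```swift
import Vision

func lookForBarcodes(in pixelBuffer: CVPixelBuffer) {
    busyPerformingVisionRequest = true /// block further requests until this one completes

    /// do the Vision work off the main thread so the camera delegate isn't stalled
    DispatchQueue.global(qos: .userInitiated).async { [weak self] in
        guard let self = self else { return }
        let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: .right)
        do {
            try handler.perform([self.barcodeRequest])
        } catch {
            print("Vision request failed: \(error)")
            self.busyPerformingVisionRequest = false /// allow the next frame to retry
        }
    }
}
```

On success, the flag is reset to false inside the completion handler (resultClassificationTracker below), which is what restarts the cycle.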

Store the detection-indicator view in a property, var previousTrackingView: UIView?

Below is my Vision handler, which gets called when the Vision request completes. I first set busyPerformingVisionRequest back to false so another Vision request can be made. Then I convert the bounding box to screen coordinates and call self.drawTrackingView(at: convertedRect).

func resultClassificationTracker(request: VNRequest?, error: Error?) {
    busyPerformingVisionRequest = false
    
    if let results = request?.results {
        if let observation = results.first as? VNBarcodeObservation {
            
            var x = observation.boundingBox.origin.x
            var y = 1 - observation.boundingBox.origin.y
            var height = CGFloat(0) /// ignore the bounding height
            var width = observation.boundingBox.width
            
            /// we're going to do some converting
            let convertedOriginalWidthOfBigImage = aspectRatioWidthOverHeight * deviceSize.height
            let offsetWidth = convertedOriginalWidthOfBigImage - deviceSize.width
            
            /// The pixel buffer that we got Vision to process is bigger than the device's screen, so we need to adjust it
            let offHalf = offsetWidth / 2
            
            width *= convertedOriginalWidthOfBigImage
            height = width * (CGFloat(9) / CGFloat(16))
            x *= convertedOriginalWidthOfBigImage
            x -= offHalf
            y *= deviceSize.height
            y -= height
            
            let convertedRect = CGRect(x: x, y: y, width: width, height: height)
            
            DispatchQueue.main.async {
                self.drawTrackingView(at: convertedRect)
            }
            
        }
    }
}

Whenever the request handler finishes, animate the detection indicator to the new rectangle

Here's my drawTrackingView function. If a tracking rectangle view has already been drawn, it animates it to the new frame. If not, it just adds it as a subview.

func drawTrackingView(at rect: CGRect) {
    if let previousTrackingView = previousTrackingView { /// already drawn one previously, just change the frame now
        UIView.animate(withDuration: 0.8) {
            previousTrackingView.frame = rect
        }
        
    } else { /// add it as a subview
        let trackingView = UIView(frame: rect)
        drawingView.addSubview(trackingView)
        trackingView.backgroundColor = UIColor.blue.withAlphaComponent(0.2)
        trackingView.layer.borderWidth = 3
        trackingView.layer.borderColor = UIColor.blue.cgColor

        previousTrackingView = trackingView
    }
}

Use Core Motion to detect device movement and adjust the detection indicator's frame

I first stored a couple of motion-related properties. Then, in viewDidLoad, I start the motion updates.

-----ViewController.swift-----

/// motionManager will be what we'll use to get device motion
var motionManager = CMMotionManager()
    
/// this will be the "device’s true orientation in space" (Source: https://nshipster.com/cmdevicemotion/)
var initialAttitude: CMAttitude?
     
/// we'll later read these values to update the highlight's position
var motionX = Double(0) /// aka Roll
var motionY = Double(0) /// aka Pitch

override func viewDidLayoutSubviews() {
    super.viewDidLayoutSubviews()
    
    /// viewDidLoad() is often too early to get the first initial attitude, so we use viewDidLayoutSubviews() instead
    if let currentAttitude = motionManager.deviceMotion?.attitude {
        /// we populate initialAttitude with the current attitude
        initialAttitude = currentAttitude
    }
    
}
override func viewDidLoad() {
    super.viewDidLoad()
    
    /// This is how often we will get device motion updates
    /// 0.03 is more than often enough and is about the rate that the video frame changes
    motionManager.deviceMotionUpdateInterval = 0.03
    
    motionManager.startDeviceMotionUpdates(to: .main) {
        [weak self] (data, error) in
        guard let data = data, error == nil else {
            return
        }
        
        /// This function will be called every 0.03 seconds
        self?.updateTrackingFrames(attitude: data.attitude)
    }

    ...
}

updateTrackingFrames is called every 0.03 seconds and reads the device's new physical motion. This is to reduce jitter, for example from the user's hand shaking.

func updateTrackingFrames(attitude: CMAttitude) {
    /// initialAttitude is an optional that points to the reference frame that the device started at
    /// we set this when the device lays out its subviews on the first launch
    if let initAttitude = initialAttitude {
        
        /// We can now translate the current attitude to the reference frame
        attitude.multiply(byInverseOf: initAttitude)
        
        /// Roll is the movement of the phone left and right, Pitch is forwards and backwards
        let rollValue = attitude.roll.radiansToDegrees
        let pitchValue = attitude.pitch.radiansToDegrees
        
        /// This is a magic number, but for simplicity, we won't do any advanced trigonometry -- also, 3 works pretty well
        let conversion = Double(3)
        
        /// Here, we figure out how much the values changed by comparing against the previous values (motionX and motionY)
        let differenceInX = (rollValue - motionX) * conversion
        let differenceInY = (pitchValue - motionY) * conversion
        
        /// Now we adjust the tracking view's position
        if let previousTrackingView = previousTrackingView {
            previousTrackingView.frame.origin.x += CGFloat(differenceInX)
            previousTrackingView.frame.origin.y += CGFloat(differenceInY)
        }
        
        /// finally, we put the new attitude values into motionX and motionY so we can compare against them in 0.03 seconds (the next time this function is called)
        motionX = rollValue
        motionY = pitchValue
    }
}

This Core Motion implementation isn't very accurate — I hardcoded the multiplier constant (Double(3)) used to adjust the tracking indicator's frame. But it's enough to compensate for small jitters.

Here's the final repo: https://github.com/aheze/BarcodeScanner