How to track the barcode with highest confidence
I'm using the Vision framework to detect barcodes. I want to display a rectangle around the barcode with the highest confidence in live video; that is, I want the rectangle to track that barcode in the live preview.

So I have this code to detect barcodes inside a region of interest:
lazy var barcodeRequest: VNDetectBarcodesRequest = {
    let barcodeRequest = VNDetectBarcodesRequest { [weak self] request, error in
        guard error == nil else {
            print("ERROR: \(error?.localizedDescription ?? "error")")
            return
        }
        self?.resultClassification(request)
    }
    barcodeRequest.regionOfInterest = CGRect(x: 0,
                                             y: 0.3,
                                             width: 1,
                                             height: 0.4)
    return barcodeRequest
}()
This method is triggered when a barcode is detected:
func resultClassification(_ request: VNRequest) {
    guard let barcodes = request.results,
          let potentialCodes = barcodes as? [VNBarcodeObservation]
    else { return }

    // choose the barcode with the highest confidence
    let highestConfidenceBarcodeDetected = potentialCodes.max(by: { $0.confidence < $1.confidence })

    // do something with highestConfidenceBarcodeDetected
    // 1
}
Here's my problem.

Now that I have the barcode with the highest confidence, I want to track it on screen. So I figure I have to add code at // 1.

But before that, I have to define this for the tracker:
var inputObservation: VNDetectedObjectObservation!

lazy var barcodeTrackingRequest: VNTrackObjectRequest = {
    let barcodeTrackingRequest = VNTrackObjectRequest(detectedObjectObservation: inputObservation) { [weak self] request, error in
        guard error == nil else {
            print("Detection error: \(String(describing: error)).")
            return
        }
        self?.resultClassificationTracker(request)
    }
    return barcodeTrackingRequest
}()

func resultClassificationTracker(_ request: VNRequest) {
    // all I want from this is to store the bounding box in a var
}
Now, how do I connect these two pieces of code so that resultClassificationTracker fires every time the tracker produces a bounding box value?
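For what it's worth, here's a sketch of what I imagine // 1 and the per-frame wiring could look like, using a VNSequenceRequestHandler to feed new frames to the tracking request. I haven't verified this, so it may well be wrong:

let sequenceHandler = VNSequenceRequestHandler()

func resultClassification(_ request: VNRequest) {
    guard let barcodes = request.results,
          let potentialCodes = barcodes as? [VNBarcodeObservation]
    else { return }
    let highestConfidenceBarcodeDetected = potentialCodes.max(by: { $0.confidence < $1.confidence })

    // 1: seed the tracker with the winning barcode's bounding box
    if let bestBarcode = highestConfidenceBarcodeDetected {
        inputObservation = VNDetectedObjectObservation(boundingBox: bestBarcode.boundingBox)
    }
}

/// then, for every new camera frame, run the tracking request;
/// resultClassificationTracker fires each time this perform call produces a result
func trackBarcode(on pixelBuffer: CVPixelBuffer) {
    guard inputObservation != nil else { return } /// tracker not seeded yet
    try? sequenceHandler.perform([barcodeTrackingRequest], on: pixelBuffer)
    /// note: since barcodeTrackingRequest is lazy, inputObservation must be set before
    /// it is first accessed; for continued tracking, barcodeTrackingRequest.inputObservation
    /// should be updated with the newest observation in resultClassificationTracker
}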
I did something similar a while ago and wrote an article about it. It was for VNRecognizeTextRequest rather than VNDetectBarcodesRequest, but it's close. Here's what I did:

- Perform the VNImageRequestHandler continuously (as soon as it finishes, start it again)
- Store the detection indicator view in a property, var previousTrackingView: UIView?
- Whenever the request handler finishes, animate the detection indicator to the new rectangle
- Use Core Motion to detect device movement and adjust the detection indicator's frame
Here's the result:

In the demo, the height/y coordinate isn't very accurate. My guess is that Vision only needs a horizontal line to scan a barcode, much like the laser scanners at the grocery store, so it doesn't return the full height. But that's a question for another day.
Perform the VNImageRequestHandler continuously (as soon as it finishes, start it again)

For this, I made a property, busyPerformingVisionRequest, and whenever it's false I call the Vision request. This happens inside the didOutput delegate function, which is called whenever the camera frame changes.
class ViewController: UIViewController, AVCaptureVideoDataOutputSampleBufferDelegate {
    var busyPerformingVisionRequest = false

    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
        if busyPerformingVisionRequest == false {
            lookForBarcodes(in: pixelBuffer) /// start the Vision request as often as possible
        }
    }
}
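lookForBarcodes(in:) isn't shown here; the real setup lives in the repo linked at the end. A minimal sketch of what it might look like, assuming the request's completion handler is the resultClassificationTracker below (the flag handling is my guess, not the author's code):

func lookForBarcodes(in pixelBuffer: CVPixelBuffer) {
    busyPerformingVisionRequest = true

    /// run Vision off the camera queue so frames keep flowing
    DispatchQueue.global(qos: .userInitiated).async {
        let request = VNDetectBarcodesRequest { request, error in
            self.resultClassificationTracker(request: request, error: error)
        }
        let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: .up)
        do {
            try handler.perform([request])
        } catch {
            print("Vision request failed: \(error)")
            self.busyPerformingVisionRequest = false
        }
    }
}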
Store the detection indicator view in a property, var previousTrackingView: UIView?

Here's my Vision handler, which is called when the Vision request finishes. I first set busyPerformingVisionRequest to false so that another Vision request can be made. Then I convert the bounding box to screen coordinates and call self.drawTrackingView(at: convertedRect).
func resultClassificationTracker(request: VNRequest?, error: Error?) {
    busyPerformingVisionRequest = false

    if let results = request?.results {
        if let observation = results.first as? VNBarcodeObservation {
            var x = observation.boundingBox.origin.x
            var y = 1 - observation.boundingBox.origin.y
            var height = CGFloat(0) /// ignore the bounding height
            var width = observation.boundingBox.width

            /// we're going to do some converting
            let convertedOriginalWidthOfBigImage = aspectRatioWidthOverHeight * deviceSize.height
            let offsetWidth = convertedOriginalWidthOfBigImage - deviceSize.width

            /// The pixel buffer that we got Vision to process is bigger than the device's screen, so we need to adjust it
            let offHalf = offsetWidth / 2

            width *= convertedOriginalWidthOfBigImage
            height = width * (CGFloat(9) / CGFloat(16))
            x *= convertedOriginalWidthOfBigImage
            x -= offHalf
            y *= deviceSize.height
            y -= height

            let convertedRect = CGRect(x: x, y: y, width: width, height: height)
            DispatchQueue.main.async {
                self.drawTrackingView(at: convertedRect)
            }
        }
    }
}
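As an aside, if the camera preview is shown through an AVCaptureVideoPreviewLayer, the manual aspect-ratio math above can often be replaced by the layer's own conversion. A sketch, assuming a previewLayer property and portrait orientation (not part of the original answer):

/// Vision's boundingBox is normalized with a bottom-left origin;
/// metadata-output rects use a top-left origin, so flip the y axis first
let box = observation.boundingBox
let metadataRect = CGRect(x: box.minX,
                          y: 1 - box.maxY,
                          width: box.width,
                          height: box.height)
/// previewLayer is assumed to be the AVCaptureVideoPreviewLayer showing the camera feed
let convertedRect = previewLayer.layerRectConverted(fromMetadataOutputRect: metadataRect)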
Whenever the request handler finishes, animate the detection indicator to the new rectangle

Here's my drawTrackingView function. If a tracking rectangle view has already been drawn, it animates it to the new frame. If not, it simply adds it as a subview.
func drawTrackingView(at rect: CGRect) {
    if let previousTrackingView = previousTrackingView { /// already drew one previously, just change the frame now
        UIView.animate(withDuration: 0.8) {
            previousTrackingView.frame = rect
        }
    } else { /// add it as a subview
        let trackingView = UIView(frame: rect)
        drawingView.addSubview(trackingView)
        trackingView.backgroundColor = UIColor.blue.withAlphaComponent(0.2)
        trackingView.layer.borderWidth = 3
        trackingView.layer.borderColor = UIColor.blue.cgColor
        previousTrackingView = trackingView
    }
}
Use Core Motion to detect device movement and adjust the detection indicator's frame

I first stored a couple of motion-related properties. Then, in viewDidLoad, I started the motion updates.

-----ViewController.swift-----
/// motionManager will be what we'll use to get device motion
var motionManager = CMMotionManager()

/// this will be the "device's true orientation in space" (Source: https://nshipster.com/cmdevicemotion/)
var initialAttitude: CMAttitude?

/// we'll later read these values to update the highlight's position
var motionX = Double(0) /// aka Roll
var motionY = Double(0) /// aka Pitch

override func viewDidLayoutSubviews() {
    super.viewDidLayoutSubviews()

    /// viewDidLoad() is often too early to get the first initial attitude, so we use viewDidLayoutSubviews() instead
    if let currentAttitude = motionManager.deviceMotion?.attitude {
        /// we populate initialAttitude with the current attitude
        initialAttitude = currentAttitude
    }
}

override func viewDidLoad() {
    super.viewDidLoad()

    /// This is how often we will get device motion updates
    /// 0.03 is more than often enough and is about the rate that the video frame changes
    motionManager.deviceMotionUpdateInterval = 0.03
    motionManager.startDeviceMotionUpdates(to: .main) { [weak self] (data, error) in
        guard let data = data, error == nil else {
            return
        }

        /// This function will be called every 0.03 seconds
        self?.updateTrackingFrames(attitude: data.attitude)
    }
    ...
}
updateTrackingFrames is then called every 0.03 seconds and reads the device's new physical motion. This compensates for jitter, for example when the user's hand is shaking.
func updateTrackingFrames(attitude: CMAttitude) {
    /// initialAttitude is an optional that points to the reference frame that the device started at
    /// we set this when the device lays out its subviews on the first launch
    if let initAttitude = initialAttitude {
        /// We can now translate the current attitude to the reference frame
        attitude.multiply(byInverseOf: initAttitude)

        /// Roll is the movement of the phone left and right, Pitch is forwards and backwards
        let rollValue = attitude.roll.radiansToDegrees
        let pitchValue = attitude.pitch.radiansToDegrees

        /// This is a magic number, but for simplicity, we won't do any advanced trigonometry -- also, 3 works pretty well
        let conversion = Double(3)

        /// Here, we figure out how much the values changed by comparing against the previous values (motionX and motionY)
        let differenceInX = (rollValue - motionX) * conversion
        let differenceInY = (pitchValue - motionY) * conversion

        /// Now we adjust the tracking view's position
        if let previousTrackingView = previousTrackingView {
            previousTrackingView.frame.origin.x += CGFloat(differenceInX)
            previousTrackingView.frame.origin.y += CGFloat(differenceInY)
        }

        /// finally, we put the new attitude values into motionX and motionY so we can compare against them in 0.03 seconds (the next time this function is called)
        motionX = rollValue
        motionY = pitchValue
    }
}
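radiansToDegrees isn't defined in the snippets above; it's presumably a small Double extension along these lines:

extension Double {
    /// convert radians to degrees (assumed helper, defined in the repo)
    var radiansToDegrees: Double {
        return self * 180 / .pi
    }
}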
This Core Motion implementation isn't very accurate: the multiplier constant (Double(3)) for adjusting the tracking indicator's frame is hardcoded. But it's enough to counteract small jitters.

Here's the final repo: https://github.com/aheze/BarcodeScanner