How to get object rect/coordinates from VNClassificationObservation

Question

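I'm having trouble getting an object's rect/coordinates from VNClassificationObservation.

My goal is to recognize an object and display a popup with the object's name. I can get the name, but I can't get the object's coordinates or frame.

Here is the code: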

let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: requestOptions)
do {
    try handler.perform([classificationRequest, detectFaceRequest])
} catch {
    print(error)
}

Then I handle the results:

func handleClassification(request: VNRequest, error: Error?) {
    guard let observations = request.results as? [VNClassificationObservation] else {
        fatalError("unexpected result type from VNCoreMLRequest")
    }

    // Filter observations: take at most the first 11 results (prefix avoids an
    // out-of-range crash when fewer are returned) and keep those above 10% confidence
    let filteredObservations = observations.prefix(11).filter { $0.confidence > 0.1 }

    // Update UI
    DispatchQueue.main.async { [weak self] in
        for observation in filteredObservations {
            print("observation: ", observation.identifier)
            // HERE: I need to display popup with observation name
        }
    }
}

Update:

lazy var classificationRequest: VNCoreMLRequest = {

    // Load the ML model through its generated class and create a Vision request for it.
    do {
        let model = try VNCoreMLModel(for: Inceptionv3().model)
        let request = VNCoreMLRequest(model: model, completionHandler: self.handleClassification)
        request.imageCropAndScaleOption = .centerCrop
        return request
    } catch {
        fatalError("can't load Vision ML model: \(error)")
    }
}()

Answer 1

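That's because classifiers don't return object coordinates or frames. A classifier only gives a probability distribution over a list of categories.

Which model are you using?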

Answer 2

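A pure classifier model can only answer "what is this a picture of?"; it can't detect and locate objects in the picture. All of the free models on the Apple developer site (including Inception v3) are of this kind.

When Vision works with such a model, it identifies the model as a classifier based on the outputs declared in the MLModel file, and returns VNClassificationObservation objects as output.

If you find or create a model that's trained to both identify and locate objects, you can still use it with Vision. When you convert that model to Core ML format, the MLModel file will describe multiple outputs. When Vision works with a model that has multiple outputs, it returns an array of VNCoreMLFeatureValueObservation objects, one for each output of the model.

How the model declares its outputs determines which feature values represent what. A model that reports a classification and a bounding box could output a string and four doubles, or a string and a multiarray, etc.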
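Because the meaning of each output depends on the model, a reasonable first step is to inspect what comes back. The following is a minimal sketch, not a definitive implementation: `describe` and `handleDetection` are hypothetical names introduced here for illustration, and decoding a multiarray into actual bounding boxes is model-specific and not shown.

```swift
import CoreML
import Vision

/// Hypothetical helper: summarizes one model output so you can see what the model returns.
func describe(_ value: MLFeatureValue) -> String {
    switch value.type {
    case .string:
        return "string: \(value.stringValue)"
    case .double:
        return "double: \(value.doubleValue)"
    case .multiArray:
        // The shape is defined by the model; turning it into boxes needs model-specific decoding.
        let shape = value.multiArrayValue?.shape.map { $0.intValue } ?? []
        return "multiarray with shape \(shape)"
    default:
        return "other feature type (raw value \(value.type.rawValue))"
    }
}

/// Hypothetical completion handler for a VNCoreMLRequest whose model is not a pure classifier.
func handleDetection(request: VNRequest, error: Error?) {
    guard let observations = request.results as? [VNCoreMLFeatureValueObservation] else {
        print("unexpected result type: \(String(describing: error))")
        return
    }
    // One observation per declared model output.
    for (index, observation) in observations.enumerated() {
        print("output \(index): \(describe(observation.featureValue))")
    }
}
```

You would pass `handleDetection` as the `completionHandler` when creating the `VNCoreMLRequest`, in the same way the question passes `handleClassification`.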

Addendum: here is a model that works on iOS 11 and returns VNCoreMLFeatureValueObservation: TinyYOLO

Answer 3

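To track and identify objects, you have to create your own model using Darknet. I struggled with the same problem and used TuriCreate to train a model. Instead of just providing images to the framework, you also have to provide images with bounding boxes. Apple has documented how to create those models here: Apple TuriCreate docs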
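Once a model does give you a box, note that bounding boxes reported by Vision (for example `boundingBox` on `VNDetectedObjectObservation`) are normalized to 0...1 with the origin in the lower-left corner, so they need converting before you can position a popup in a UIKit view. The sketch below shows one way to do that; `convertToTopLeftOrigin` is a hypothetical helper, and it assumes the image fills the target area exactly, with no aspect-fit letterboxing.

```swift
import Foundation

/// Hypothetical helper: converts a Vision-style normalized rect (origin bottom-left,
/// values in 0...1) into a rect in a top-left-origin space of the given size.
func convertToTopLeftOrigin(_ normalized: CGRect, in size: CGSize) -> CGRect {
    let width = normalized.width * size.width
    let height = normalized.height * size.height
    let x = normalized.minX * size.width
    // Flip the y axis: Vision measures from the bottom edge, UIKit from the top.
    let y = (1 - normalized.maxY) * size.height
    return CGRect(x: x, y: y, width: width, height: height)
}

// Example: a box covering the middle half of the width, in the upper half of the image.
let box = CGRect(x: 0.25, y: 0.5, width: 0.5, height: 0.25)
let frame = convertToTopLeftOrigin(box, in: CGSize(width: 400, height: 200))
print(frame)
```

If the preview is scaled with aspect fit or aspect fill, the size and offset of the displayed image inside the view would have to be taken into account as well.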

image-recognition

ios

swift

ios11

coreml