How to use mlmodel to make predictions on UIImage?
I've been working on a project that involves detecting whether a person in an image is happy or sad. For that, I'm using a machine learning model. I've converted the Python model to .mlmodel and implemented it in the app. The model expects a 48x48 grayscale image. I need help converting my UIImage into that format.

Link to the project:
https://github.com/LOLIPOP-INTELLIGENCE/happy_faces_v1

Any help would be appreciated!
Thanks
An efficient way to resize an image and convert it to grayscale is to use vImage. See Converting Color Images to Grayscale:

For example:
/*
 The Core Graphics image representation of the source asset.
 */
let cgImage: CGImage = {
    guard let cgImage = #imageLiteral(resourceName: "image.jpg").cgImage else {
        fatalError("Unable to get CGImage")
    }

    return cgImage
}()

/*
 The format of the source asset.
 */
lazy var format: vImage_CGImageFormat = {
    guard let sourceColorSpace = cgImage.colorSpace else {
        fatalError("Unable to get color space")
    }

    return vImage_CGImageFormat(
        bitsPerComponent: UInt32(cgImage.bitsPerComponent),
        bitsPerPixel: UInt32(cgImage.bitsPerPixel),
        colorSpace: Unmanaged.passRetained(sourceColorSpace),
        bitmapInfo: cgImage.bitmapInfo,
        version: 0,
        decode: nil,
        renderingIntent: cgImage.renderingIntent)
}()

/*
 The vImage buffer containing a scaled down copy of the source asset.
 */
lazy var sourceBuffer: vImage_Buffer = {
    var sourceImageBuffer = vImage_Buffer()

    vImageBuffer_InitWithCGImage(&sourceImageBuffer,
                                 &format,
                                 nil,
                                 cgImage,
                                 vImage_Flags(kvImageNoFlags))

    // Release the full-size intermediate buffer once it has been scaled down.
    defer { free(sourceImageBuffer.data) }

    var scaledBuffer = vImage_Buffer()

    vImageBuffer_Init(&scaledBuffer,
                      48,
                      48,
                      format.bitsPerPixel,
                      vImage_Flags(kvImageNoFlags))

    vImageScale_ARGB8888(&sourceImageBuffer,
                         &scaledBuffer,
                         nil,
                         vImage_Flags(kvImageNoFlags))

    return scaledBuffer
}()

/*
 The 1-channel, 8-bit vImage buffer used as the operation destination.
 */
lazy var destinationBuffer: vImage_Buffer = {
    var destinationBuffer = vImage_Buffer()

    vImageBuffer_Init(&destinationBuffer,
                      sourceBuffer.height,
                      sourceBuffer.width,
                      8,
                      vImage_Flags(kvImageNoFlags))

    return destinationBuffer
}()
Note that I changed Apple's example where it calls vImageBuffer_Init, forcing a 48×48 size.

Then:
// Declare the three coefficients that model the eye's sensitivity
// to color.
let redCoefficient: Float = 0.2126
let greenCoefficient: Float = 0.7152
let blueCoefficient: Float = 0.0722

// Create a 1D matrix containing the three luma coefficients that
// specify the color-to-grayscale conversion.
let divisor: Int32 = 0x1000
let fDivisor = Float(divisor)

var coefficientsMatrix = [
    Int16(redCoefficient * fDivisor),
    Int16(greenCoefficient * fDivisor),
    Int16(blueCoefficient * fDivisor)
]

// Use the matrix of coefficients to compute the scalar luminance by
// returning the dot product of each RGB pixel and the coefficients
// matrix.
let preBias: [Int16] = [0, 0, 0, 0]
let postBias: Int32 = 0

vImageMatrixMultiply_ARGB8888ToPlanar8(&sourceBuffer,
                                       &destinationBuffer,
                                       &coefficientsMatrix,
                                       divisor,
                                       preBias,
                                       postBias,
                                       vImage_Flags(kvImageNoFlags))

// Create a 1-channel, 8-bit grayscale format that's used to
// generate a displayable image.
var monoFormat = vImage_CGImageFormat(
    bitsPerComponent: 8,
    bitsPerPixel: 8,
    colorSpace: Unmanaged.passRetained(CGColorSpaceCreateDeviceGray()),
    bitmapInfo: CGBitmapInfo(rawValue: CGImageAlphaInfo.none.rawValue),
    version: 0,
    decode: nil,
    renderingIntent: .defaultIntent)

// Create a Core Graphics image from the grayscale destination buffer.
let result = vImageCreateCGImageFromBuffer(
    &destinationBuffer,
    &monoFormat,
    nil,
    nil,
    vImage_Flags(kvImageNoFlags),
    nil)

// Display the grayscale result.
if let result = result {
    imageView.image = UIImage(cgImage: result.takeRetainedValue())
}
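Once you have the 48×48 grayscale CGImage, you can run the prediction itself. Here is a minimal sketch using Vision, assuming the class that Xcode generated from your .mlmodel is named HappyFaces (substitute the name your model file actually produces) and that the model's input is declared as an image:

import CoreML
import Vision

func classify(_ grayscaleImage: CGImage) {
    // Wrap the generated Core ML model for Vision. HappyFaces is a
    // placeholder for whatever class Xcode generated from the .mlmodel.
    guard let model = try? HappyFaces(configuration: MLModelConfiguration()).model,
          let visionModel = try? VNCoreMLModel(for: model) else {
        fatalError("Unable to load model")
    }

    let request = VNCoreMLRequest(model: visionModel) { request, _ in
        guard let best = (request.results as? [VNClassificationObservation])?.first else {
            return
        }
        print(best.identifier, best.confidence)   // e.g. "happy" 0.93
    }

    // Vision resamples the input to match the model's input description,
    // so you can also hand it the full-size image directly.
    let handler = VNImageRequestHandler(cgImage: grayscaleImage)
    try? handler.perform([request])
}

Note that VNCoreMLModel only accepts models whose input is an image; if your converted model declares a multi-array input instead, you would need to copy the grayscale pixels into an MLMultiArray and call the generated prediction method directly.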
Now, this assumes the original image is already square. If it isn't, you can crop the image before creating the source vImage_Buffer:
lazy var sourceBuffer: vImage_Buffer = {
    var sourceImageBuffer = vImage_Buffer()

    // Crop to a centered square so the 48×48 result isn't distorted.
    let width = min(cgImage.width, cgImage.height)
    let rect = CGRect(x: (cgImage.width - width) / 2,
                      y: (cgImage.height - width) / 2,
                      width: width,
                      height: width)
    let croppedImage = cgImage.cropping(to: rect)!

    vImageBuffer_InitWithCGImage(&sourceImageBuffer,
                                 &format,
                                 nil,
                                 croppedImage,
                                 vImage_Flags(kvImageNoFlags))

    // Release the full-size intermediate buffer once it has been scaled down.
    defer { free(sourceImageBuffer.data) }

    var scaledBuffer = vImage_Buffer()

    vImageBuffer_Init(&scaledBuffer,
                      48,
                      48,
                      format.bitsPerPixel,
                      vImage_Flags(kvImageNoFlags))

    vImageScale_ARGB8888(&sourceImageBuffer,
                         &scaledBuffer,
                         nil,
                         vImage_Flags(kvImageNoFlags))

    return scaledBuffer
}()
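One caveat: vImageBuffer_Init and vImageBuffer_InitWithCGImage allocate pixel memory that the caller owns. If these lazy vars live in a view controller, a deinit along these lines would balance the allocations:

deinit {
    // Release the pixel memory owned by the vImage buffers.
    free(sourceBuffer.data)
    free(destinationBuffer.data)
}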