How can I read the CVPixelBuffer as 4 channel float format from a CIImage?
I'm currently trying to run some calculations on CIImage structures. We use a custom Core ML model on video frames, and at the same time convert those frames to the required format on the GPU using CIFilters.
As a first step, I need to run some calculations on two of the outputs the model generates and find the mean and standard deviation of the pixel data per channel.
For testing and as a tech preview, I was able to create a UIImage, read the CVPixelData, and do the conversion and calculation on the CPU. But when trying to adapt it to the GPU, I got stuck.
The process is simple:
- Convert the CIImage from BGRA to LAB format. We don't need the alpha channel, but we keep it, as LAB-A
- Run the calculations on the pixel data.
- Convert back from LAB to BGRA, copying the alpha channel over as-is.
In the current state, I'm using a custom CIFilter + Metal kernel to convert the CIImage from RGB to LAB (and back to RGB). If there is no calculation in between, the RGB > LAB > RGB conversion works as expected and returns an identical image without any distortion. This tells me that no floating-point precision is lost.
But when I try to read the pixel data in between, I can't get the float values I'm looking for. The CVPixelBuffer created from the LAB-format CIImage gives me values that are always zero. I tried several different OSType formats like kCVPixelFormatType_64RGBAHalf, kCVPixelFormatType_128RGBAFloat, kCVPixelFormatType_32ARGB, etc., and none of them returns float values. However, if I read the data from another image, I always get UInt8 values as expected...
So my question is, as in the title: how can I read the CVPixelBuffer as 4 channel float format from a CIImage?
Simplified Swift and Metal code for the flow is below.
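For reference, the RGB > LAB filter described above could be wrapped roughly like this. This is a hypothetical sketch, not the asker's actual filter: it assumes the `rgbToLab` Metal kernel shown further down is compiled into the app's default metallib with the Core Image kernel compiler flags, and the class name `CIConvertRGBToLAB` simply mirrors the snippet below.

```swift
import CoreImage

/// Hypothetical sketch of a CIFilter wrapping the `rgbToLab` Metal kernel.
/// Assumes the kernel was built into "default.metallib" with -fcikernel.
class CIConvertRGBToLAB: CIFilter {
    var inputImage: CIImage?

    // Load the color kernel once from the compiled Metal library.
    static let kernel: CIColorKernel = {
        let url = Bundle.main.url(forResource: "default", withExtension: "metallib")!
        let data = try! Data(contentsOf: url)
        return try! CIColorKernel(functionName: "rgbToLab", fromMetalLibraryData: data)
    }()

    override var outputImage: CIImage? {
        guard let input = inputImage else { return nil }
        // Apply the per-pixel kernel over the full input extent.
        return Self.kernel.apply(extent: input.extent, arguments: [input])
    }
}
```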
let ciRgbToLab = CIConvertRGBToLAB() // CIFilter using metal for kernel
let ciLabToRgb = CIConvertLABToRGB() // CIFilter using metal for kernel
ciRgbToLab.inputImage = source // "source" is a CIImage
guard let sourceLab = ciRgbToLab.outputImage else { throw ... }
ciRgbToLab.inputImage = target // "target" is a CIImage
guard let targetLab = ciRgbToLab.outputImage else { throw ... }
// Get the CVPixelBuffer and lock the data.
guard let sourceBuffer = sourceLab.cvPixelBuffer else { throw ... }
CVPixelBufferLockBaseAddress(sourceBuffer, CVPixelBufferLockFlags(rawValue: 0))
defer {
CVPixelBufferUnlockBaseAddress(sourceBuffer, CVPixelBufferLockFlags(rawValue: 0))
}
// Access to the data
guard let sourceAddress = CVPixelBufferGetBaseAddress(sourceBuffer) else { throw ... }
let sourceDataSize = CVPixelBufferGetDataSize(sourceBuffer)
// Bind as Float (32-bit), which is what a 128RGBAFloat buffer contains;
// capacity is in elements, not bytes.
let sourceData = sourceAddress.bindMemory(to: Float.self, capacity: sourceDataSize / MemoryLayout<Float>.stride)
// ... do calculations
// ... generates a new CIImage named "targetTransfered"
ciLabToRgb.inputImage = targetTransfered //*
guard let rgbFinal = ciLabToRgb.outputImage else { throw ... }
//* If "targetTransfered" is replaced with "targetLab", we get the exact image as "target".
#include <metal_stdlib>
using namespace metal;
#include <CoreImage/CoreImage.h>
extern "C" {
namespace coreimage {
float4 xyzToLabConversion(float4 pixel) {
...
return float4(l, a, b, pixel.a);
}
float4 rgbToXyzConversion(float4 pixel) {
...
return float4(x, y, z, pixel.a);
}
float4 rgbToLab(sample_t s) {
float4 xyz = rgbToXyzConversion(s);
float4 lab = xyzToLabConversion(xyz);
return lab;
}
float4 xyzToRgbConversion(float4 pixel) {
...
return float4(R, G, B, pixel.a);
}
float4 labToXyzConversion(float4 pixel) {
...
return float4(X, Y, Z, pixel.a);
}
float4 labtoRgb(sample_t s) {
float4 xyz = labToXyzConversion(s);
float4 rgb = xyzToRgbConversion(xyz);
return rgb;
}
}
}
This is the extension I use to convert a CIImage to a CVPixelBuffer. Since the images are created on-device from the same source, they are always in BGRA format. I have no idea how to convert this to float values...
extension CIImage {
var cvPixelBuffer: CVPixelBuffer? {
let attrs = [
kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue,
kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue,
kCVPixelBufferMetalCompatibilityKey: kCFBooleanTrue
] as CFDictionary
var pixelBuffer: CVPixelBuffer?
let status = CVPixelBufferCreate(kCFAllocatorDefault,
Int(self.extent.width),
Int(self.extent.height),
kCVPixelFormatType_32BGRA,
attrs,
&pixelBuffer)
guard status == kCVReturnSuccess else { return nil }
guard let buffer = pixelBuffer else { return nil }
CVPixelBufferLockBaseAddress(buffer, CVPixelBufferLockFlags.init(rawValue: 0))
let context = CIContext()
context.render(self, to: buffer)
CVPixelBufferUnlockBaseAddress(buffer, CVPixelBufferLockFlags(rawValue: 0))
return pixelBuffer
}
}
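One likely cause of the all-zero reads above is that the extension always renders into a `kCVPixelFormatType_32BGRA` buffer with a default (8-bit, color-managed) CIContext, so the LAB values get clamped and color-matched before they ever reach memory. A minimal sketch of a float variant, assuming the property name `cvPixelBufferFloat` is new here, could look like this:

```swift
import CoreImage

extension CIImage {
    /// Sketch of a float variant of the extension above: renders into a
    /// 128-bit float pixel buffer using a float working format, and passes
    /// a nil color space so Core Image does not color-match the LAB values.
    var cvPixelBufferFloat: CVPixelBuffer? {
        let attrs = [
            kCVPixelBufferMetalCompatibilityKey: kCFBooleanTrue!
        ] as CFDictionary
        var pixelBuffer: CVPixelBuffer?
        let status = CVPixelBufferCreate(kCFAllocatorDefault,
                                         Int(extent.width),
                                         Int(extent.height),
                                         kCVPixelFormatType_128RGBAFloat,
                                         attrs,
                                         &pixelBuffer)
        guard status == kCVReturnSuccess, let buffer = pixelBuffer else { return nil }
        // Keep intermediates in full float precision.
        let context = CIContext(options: [.workingFormat: CIFormat.RGBAf])
        context.render(self, to: buffer, bounds: extent, colorSpace: nil)
        return buffer
    }
}
```

The two changes that matter are the `.workingFormat: CIFormat.RGBAf` context option and `colorSpace: nil` in the render call; without them the out-of-gamut LAB components are liable to be clamped to `[0, 1]` or remapped on the way out.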
PS: I stripped out the Metal kernel code to fit it in here. If you need the RGB > LAB > RGB conversion, send me a message and I'll be happy to share the filters.
It's strange that you get all zeros, especially when you set the format to kCVPixelFormatType_128RGBAFloat...
However, I strongly recommend you check out CIImageProcessorKernel, which is made for exactly this use case: adding custom (potentially CPU-based) processing steps to a Core Image pipeline. In the process function you can access the input and output buffers as a MTLTexture or CVPixelBuffer, or even directly via the baseAddress.
Here is an example kernel I wrote for computing the mean and variance of the input image using Metal Performance Shaders and returning them in a 2x1 pixel CIImage:
import CoreImage
import MetalPerformanceShaders
/// Processing kernel that computes the mean and the variance of a given image and stores
/// those values in a 2x1 pixel return image.
class MeanVarianceKernel: CIImageProcessorKernel {
override class func roi(forInput input: Int32, arguments: [String : Any]?, outputRect: CGRect) -> CGRect {
// we need to read the full extent of the input
return arguments?["inputExtent"] as? CGRect ?? outputRect
}
override class var outputFormat: CIFormat {
return .RGBAf
}
override class var synchronizeInputs: Bool {
// no need to wait for CPU synchronization since the processing is also happening on the GPU
return false
}
/// Convenience method for calling the `apply` method from outside.
class func apply(to input: CIImage) -> CIImage {
// pass the extent of the input as an argument since we need to know the full extent in the ROI callback above
return try! self.apply(withExtent: CGRect(x: 0, y: 0, width: 2, height: 1), inputs: [input], arguments: ["inputExtent": input.extent])
}
override class func process(with inputs: [CIImageProcessorInput]?, arguments: [String : Any]?, output: CIImageProcessorOutput) throws {
guard
let commandBuffer = output.metalCommandBuffer,
let input = inputs?.first,
let sourceTexture = input.metalTexture,
let destinationTexture = output.metalTexture
else {
return
}
let meanVarianceShader = MPSImageStatisticsMeanAndVariance(device: commandBuffer.device)
meanVarianceShader.encode(commandBuffer: commandBuffer, sourceTexture: sourceTexture, destinationTexture: destinationTexture)
}
}
It can easily be added into a filter pipeline like this:
let meanVariance: CIImage = MeanVarianceKernel.apply(to: inputImage)
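To actually get the statistics onto the CPU, the 2x1 result image still has to be read back as floats. A minimal sketch, assuming the `meanVariance` image from the line above, with the first pixel holding the mean and the second the variance (as MPSImageStatisticsMeanAndVariance writes them):

```swift
import CoreImage

// Sketch: render the 2x1 float image into a small CPU-side buffer.
let context = CIContext(options: [.workingFormat: CIFormat.RGBAf])
var values = [Float](repeating: 0, count: 8) // 2 pixels x RGBA
context.render(meanVariance,
               toBitmap: &values,
               rowBytes: 2 * 4 * MemoryLayout<Float>.stride,
               bounds: CGRect(x: 0, y: 0, width: 2, height: 1),
               format: .RGBAf,
               colorSpace: nil) // no color matching on the raw statistics
let mean = Array(values[0..<4])     // per-channel mean (RGBA)
let variance = Array(values[4..<8]) // per-channel variance (RGBA)
```

Again, `colorSpace: nil` matters here, since the statistics are data, not colors.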