How to extract pixel data for processing from CMSampleBuffer using Swift in iOS 9?
I'm writing an app in Swift that uses the Scandit barcode scanning SDK. The SDK gives you direct access to the camera frames and delivers each frame as a CMSampleBuffer. Their documentation is in Objective-C, and I'm having trouble getting this working in Swift. I don't know whether the problem is in porting the code, or whether something is wrong with the sample buffer itself, perhaps because Core Media has changed since their documentation was generated.
Their API exposes the frame as follows (Objective-C):
@interface YourViewController () <SBSProcessFrameDelegate>
...
- (void)barcodePicker:(SBSBarcodePicker*)barcodePicker
didProcessFrame:(CMSampleBufferRef)frame
session:(SBSScanSession*)session {
// Process the frame yourself.
}
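For context, the bridged Swift 2 form of that delegate method (where I'm trying to do the processing) looks roughly like this; the signature is just the usual Objective-C-to-Swift bridging of the method above, and the body is a placeholder:

class YourViewController: UIViewController, SBSProcessFrameDelegate {

    // Swift 2 counterpart of the Objective-C delegate method above (bridged signature).
    func barcodePicker(barcodePicker: SBSBarcodePicker, didProcessFrame frame: CMSampleBuffer, session: SBSScanSession) {
        // Process the frame yourself.
    }
}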
Building on several answers here on SO, I tried processing the frame with the following:
let imageBuffer = CMSampleBufferGetImageBuffer(frame)!
CVPixelBufferLockBaseAddress(imageBuffer, 0)
let baseAddress = CVPixelBufferGetBaseAddress(imageBuffer)
let width = CVPixelBufferGetWidth(imageBuffer)
let height = CVPixelBufferGetHeight(imageBuffer)
let bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer)
let colorSpace = CGColorSpaceCreateDeviceRGB()
let bitmapInfo = CGBitmapInfo(rawValue: CGImageAlphaInfo.NoneSkipFirst.rawValue | CGBitmapInfo.ByteOrder32Little.rawValue)
let context = CGBitmapContextCreate(baseAddress, width, height, 8, bytesPerRow, colorSpace, bitmapInfo.rawValue)
let quartzImage = CGBitmapContextCreateImage(context)
CVPixelBufferUnlockBaseAddress(imageBuffer,0)
let image = UIImage(CGImage: quartzImage!)
However, this fails with:
Jan 29 09:01:30 Scandit[1308] <Error>: CGBitmapContextCreate: invalid data bytes/row: should be at least 7680 for 8 integer bits/component, 3 components, kCGImageAlphaNoneSkipFirst.
Jan 29 09:01:30 Scandit[1308] <Error>: CGBitmapContextCreateImage: invalid context 0x0. If you want to see the backtrace, please set CG_CONTEXT_SHOW_BACKTRACE environmental variable.
fatal error: unexpectedly found nil while unwrapping an Optional value
The fatal error occurs when trying to construct the UIImage from quartzImage.
The width, height, and bytesPerRow (at the base address) are:
Width: 1920
Height: 1080
Bytes per row: 2904
From the delegate, the buffer contains the following according to CMSampleBufferGetFormatDescription(frame):
Optional(<CMVideoFormatDescription 0x1447dafa0 [0x1a1864b68]> {
mediaType:'vide'
mediaSubType:'420f'
mediaSpecific: {
codecType: '420f' dimensions: 1920 x 1080
}
extensions: {<CFBasicHash 0x1447dba10 [0x1a1864b68]>{type = immutable dict, count = 6,
entries =>
0 : <CFString 0x19d28b678 [0x1a1864b68]>{contents = "CVImageBufferYCbCrMatrix"} = <CFString 0x19d28b6b8 [0x1a1864b68]>{contents = "ITU_R_601_4"}
1 : <CFString 0x19d28b7d8 [0x1a1864b68]>{contents = "CVImageBufferTransferFunction"} = <CFString 0x19d28b698 [0x1a1864b68]>{contents = "ITU_R_709_2"}
2 : <CFString 0x19d2b65c0 [0x1a1864b68]>{contents = "CVBytesPerRow"} = <CFNumber 0xb00000000000b582 [0x1a1864b68]>{value = +2904, type = kCFNumberSInt32Type}
3 : <CFString 0x19d2b6640 [0x1a1864b68]>{contents = "Version"} = <CFNumber 0xb000000000000022 [0x1a1864b68]>{value = +2, type = kCFNumberSInt32Type}
5 : <CFString 0x19d28b758 [0x1a1864b68]>{contents = "CVImageBufferColorPrimaries"} = <CFString 0x19d28b698 [0x1a1864b68]>{contents = "ITU_R_709_2"}
6 : <CFString 0x19d28b818 [0x1a1864b68]>{contents = "CVImageBufferChromaLocationTopField"} = <CFString 0x19d28b878 [0x1a1864b68]>{contents = "Center"}
}
}
})
I'm aware that there may be multiple "planes" here, but even:
let pixelBufferBytesPerRow0 = CVPixelBufferGetBytesPerRowOfPlane(imageBuffer, 0)
let pixelBufferBytesPerRow1 = CVPixelBufferGetBytesPerRowOfPlane(imageBuffer, 1)
gives:
Pixel buffer bytes per row (Plane 0): 1920
Pixel buffer bytes per row (Plane 1): 1920
I don't understand that discrepancy.
I've also tried handling each pixel individually, since the buffer clearly contains some form of YCbCr, but every method I've tried has failed. The Scandit API suggests (Objective-C):
// Get the buffer info for the YCbCrBiPlanar format.
void *baseAddress = CVPixelBufferGetBaseAddress(imageBuffer);
CVPlanarPixelBufferInfo_YCbCrBiPlanar *bufferInfo = (CVPlanarPixelBufferInfo_YCbCrBiPlanar *)baseAddress;
However, I couldn't find a Swift approach that lets me access the buffer info via CVPlanarPixelBufferInfo... everything I tried failed, so I couldn't determine the offsets for "Y", "Cr", etc.
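For reference, here is a rough, untested sketch of how that struct could in principle be read from Swift 2 once the buffer's base address is locked (CVPlanarPixelBufferInfo_YCbCrBiPlanar stores its offsets and row-bytes as big-endian 32-bit values, hence the byte swaps); the per-plane CVPixelBuffer accessors used in the answers below turn out to be much simpler:

let baseAddress = CVPixelBufferGetBaseAddress(imageBuffer)
// Reinterpret the start of the buffer as the bi-planar info header.
let info = UnsafePointer<CVPlanarPixelBufferInfo_YCbCrBiPlanar>(baseAddress).memory
// The header fields are big-endian, so swap them to host byte order before use.
let lumaOffset = Int(CFSwapInt32BigToHost(UInt32(bitPattern: info.componentInfoY.offset)))
let lumaRowBytes = Int(CFSwapInt32BigToHost(info.componentInfoY.rowBytes))
let chromaOffset = Int(CFSwapInt32BigToHost(UInt32(bitPattern: info.componentInfoCbCr.offset)))
let chromaRowBytes = Int(CFSwapInt32BigToHost(info.componentInfoCbCr.rowBytes))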
How can I access the pixel data in this buffer? Is this a problem with the CMSampleBuffer the SDK passes, with iOS 9, or both?
This is not a complete answer, just some hints:
Scandit uses the YCbCrBiPlanar format. It has one Y byte for each pixel and one Cb and one Cr byte for each group of 2x2 pixels. The Y values are in the first plane, and the Cb and Cr values are in the second plane.
If the image is w x h pixels in size, then the first plane contains h rows of w bytes (possibly with some padding at the end of each row).
The second plane contains h / 2 rows of w / 2 pairs of bytes. Each pair consists of a Cb and a Cr value. Again, there may be some padding at the end of each row.
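For the 1920 x 1080 frame in the question, for example, that works out to 1080 / 2 = 540 chroma rows of 1920 / 2 = 960 (Cb, Cr) pairs, i.e. 960 * 2 = 1920 bytes per row, which matches the 1920 bytes per row reported for plane 1 above even though that plane covers only half the rows and columns.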
So the Y value of the pixel at position (x, y) can be found at the address:
Y: baseAddressPlane1 + y * bytesPerRowPlane1 + x
And the Cb and Cr values of the pixel at position (x, y) can be found at the addresses:
Cb: baseAddressPlane2 + (y / 2) * bytesPerRowPlane2 + (x / 2) * 2
Cr: baseAddressPlane2 + (y / 2) * bytesPerRowPlane2 + (x / 2) * 2 + 1
The divisions by 2 are integer divisions that discard the fractional part.
Based on Codo's "hints" and integrating the Objective-C code from the Scandit documentation, I worked out a solution in Swift. Although I've accepted Codo's answer because it helped enormously, I'm also answering my own question in the hope that a complete solution will help someone in the future:
let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)!
CVPixelBufferLockBaseAddress(pixelBuffer, 0)

// Plane 0 holds the luma (Y) bytes; plane 1 holds the interleaved chroma (Cb, Cr) pairs.
let lumaBaseAddress = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0)
let chromaBaseAddress = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 1)

let width = CVPixelBufferGetWidth(pixelBuffer)
let height = CVPixelBufferGetHeight(pixelBuffer)
let lumaBytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0)
let chromaBytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 1)
let lumaBuffer = UnsafeMutablePointer<UInt8>(lumaBaseAddress)
let chromaBuffer = UnsafeMutablePointer<UInt8>(chromaBaseAddress)

// Output buffer, 4 bytes per pixel (B, G, R, X in memory to match the bitmap info below).
var rgbaImage = [UInt8](count: 4 * width * height, repeatedValue: 0)

for var x = 0; x < width; x++ {
    for var y = 0; y < height; y++ {
        // Index into plane 0 (one Y byte per pixel) and plane 1 (one Cb/Cr pair per 2x2 block).
        let lumaIndex = x + y * lumaBytesPerRow
        let chromaIndex = (y / 2) * chromaBytesPerRow + (x / 2) * 2
        let yp = lumaBuffer[lumaIndex]
        let cb = chromaBuffer[chromaIndex]
        let cr = chromaBuffer[chromaIndex + 1]

        // YCbCr -> RGB conversion, then clamp to 0...255.
        let ri = Double(yp) + 1.402 * (Double(cr) - 128)
        let gi = Double(yp) - 0.34414 * (Double(cb) - 128) - 0.71414 * (Double(cr) - 128)
        let bi = Double(yp) + 1.772 * (Double(cb) - 128)
        let r = UInt8(min(max(ri, 0), 255))
        let g = UInt8(min(max(gi, 0), 255))
        let b = UInt8(min(max(bi, 0), 255))

        rgbaImage[(x + y * width) * 4] = b
        rgbaImage[(x + y * width) * 4 + 1] = g
        rgbaImage[(x + y * width) * 4 + 2] = r
        rgbaImage[(x + y * width) * 4 + 3] = 255
    }
}

let colorSpace = CGColorSpaceCreateDeviceRGB()
let dataProvider: CGDataProviderRef = CGDataProviderCreateWithData(nil, rgbaImage, 4 * width * height, nil)!
let bitmapInfo = CGBitmapInfo(rawValue: CGImageAlphaInfo.NoneSkipFirst.rawValue | CGBitmapInfo.ByteOrder32Little.rawValue)
let cgImage: CGImageRef = CGImageCreate(width, height, 8, 32, width * 4, colorSpace!, bitmapInfo, dataProvider, nil, true, CGColorRenderingIntent.RenderingIntentDefault)!
let image: UIImage = UIImage(CGImage: cgImage)

CVPixelBufferUnlockBaseAddress(pixelBuffer, 0)
Despite iterating through the entire 8.3 MP image, the code executes very quickly. I'll admit I don't have a deep understanding of the Core Media framework, but I believe this means the code is being executed on the GPU. However, I would appreciate any comments on the code to make it more efficient, or to improve its "Swiftness", as I'm a complete amateur.
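One small, speculative tweak in that direction: since the Y bytes of a row sit next to each other in memory, making y the outer loop and x the inner loop keeps both the plane reads and the rgbaImage writes sequential rather than strided. A minimal variant of the same loop, using the same variables as above:

for y in 0..<height {
    for x in 0..<width {
        let lumaIndex = x + y * lumaBytesPerRow
        let chromaIndex = (y / 2) * chromaBytesPerRow + (x / 2) * 2
        let yp = lumaBuffer[lumaIndex]
        let cb = chromaBuffer[chromaIndex]
        let cr = chromaBuffer[chromaIndex + 1]
        let ri = Double(yp) + 1.402 * (Double(cr) - 128)
        let gi = Double(yp) - 0.34414 * (Double(cb) - 128) - 0.71414 * (Double(cr) - 128)
        let bi = Double(yp) + 1.772 * (Double(cb) - 128)
        rgbaImage[(x + y * width) * 4] = UInt8(min(max(bi, 0), 255))
        rgbaImage[(x + y * width) * 4 + 1] = UInt8(min(max(gi, 0), 255))
        rgbaImage[(x + y * width) * 4 + 2] = UInt8(min(max(ri, 0), 255))
        rgbaImage[(x + y * width) * 4 + 3] = 255
    }
}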