iOS reverse audio through AVAssetWriter

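I'm trying to reverse audio on iOS using AVAsset and AVAssetWriter. The following code works, but the output file is shorter than the input file. For example, the input file has a duration of 1:59, but the output is 1:50 long with the same audio content.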

- (void)reverse:(AVAsset *)asset
{
AVAssetReader* reader = [[AVAssetReader alloc] initWithAsset:asset error:nil];

AVAssetTrack* audioTrack = [[asset tracksWithMediaType:AVMediaTypeAudio] objectAtIndex:0];

NSMutableDictionary* audioReadSettings = [NSMutableDictionary dictionary];
[audioReadSettings setValue:[NSNumber numberWithInt:kAudioFormatLinearPCM]
                     forKey:AVFormatIDKey];

AVAssetReaderTrackOutput* readerOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:audioTrack outputSettings:audioReadSettings];
[reader addOutput:readerOutput];
[reader startReading];

NSDictionary *outputSettings = [NSDictionary dictionaryWithObjectsAndKeys:
                                [NSNumber numberWithInt: kAudioFormatMPEG4AAC], AVFormatIDKey,
                                [NSNumber numberWithFloat:44100.0], AVSampleRateKey,
                                [NSNumber numberWithInt:2], AVNumberOfChannelsKey,
                                [NSNumber numberWithInt:128000], AVEncoderBitRateKey,
                                [NSData data], AVChannelLayoutKey,
                                nil];

AVAssetWriterInput *writerInput = [[AVAssetWriterInput alloc] initWithMediaType:AVMediaTypeAudio
                                                                 outputSettings:outputSettings];

NSString *exportPath = [NSTemporaryDirectory() stringByAppendingPathComponent:@"out.m4a"];

NSURL *exportURL = [NSURL fileURLWithPath:exportPath];
NSError *writerError = nil;
AVAssetWriter *writer = [[AVAssetWriter alloc] initWithURL:exportURL
                                                  fileType:AVFileTypeAppleM4A
                                                     error:&writerError];
[writerInput setExpectsMediaDataInRealTime:NO];
[writer addInput:writerInput];
[writer startWriting];
[writer startSessionAtSourceTime:kCMTimeZero];

CMSampleBufferRef sample = [readerOutput copyNextSampleBuffer];
NSMutableArray *samples = [[NSMutableArray alloc] init];

while (sample != NULL) {

    sample = [readerOutput copyNextSampleBuffer];

    if (sample == NULL)
        continue;

    [samples addObject:(__bridge id)(sample)];
    CFRelease(sample);
}

NSArray* reversedSamples = [[samples reverseObjectEnumerator] allObjects];

for (id reversedSample in reversedSamples) {
    if (writerInput.readyForMoreMediaData)  {
        [writerInput appendSampleBuffer:(__bridge CMSampleBufferRef)(reversedSample)];
    }
    else {
        [NSThread sleepForTimeInterval:0.05];
    }
}

[writerInput markAsFinished];
dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0);
dispatch_async(queue, ^{
    [writer finishWriting];
});
}

Update:

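If I write the samples directly in the first while loop, everything works (even with the writerInput.readyForMoreMediaData check). In that case the resulting file has exactly the same duration as the original. But if I write the same samples from the reversed NSArray, the result is shorter.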

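Print out the size of each buffer in number of samples in the "reading" readerOutput while loop, and do the same in the "writing" writerInput for-loop. That way you can see all the buffer sizes and check whether they add up.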

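For example, are you missing or skipping a buffer? When if (writerInput.readyForMoreMediaData) is false, you "sleep", but then move on to the next reversedSample in reversedSamples, so that buffer is effectively dropped from the writerInput.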
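The skipping behaviour described above can be reproduced without AVFoundation. The following is a minimal sketch, assuming a hypothetical MockWriterInput that stands in for AVAssetWriterInput and reports "ready" only on every other poll; the class and both function names are made up for illustration.

```swift
// Hypothetical stand-in for AVAssetWriterInput: ready only on every other poll.
final class MockWriterInput {
    private var polls = 0
    private(set) var appended: [Int] = []

    var isReadyForMoreMediaData: Bool {
        polls += 1
        return polls % 2 == 0
    }

    func append(_ buffer: Int) {
        appended.append(buffer)
    }
}

// Pattern from the question: a buffer that arrives while the input is busy is never retried.
func writeSkipping(_ buffers: [Int], to input: MockWriterInput) {
    for buffer in buffers {
        if input.isReadyForMoreMediaData {
            input.append(buffer)
        }
        // else: the real code sleeps here, then the loop moves on to the next buffer
    }
}

// Retry pattern: poll until the input is ready, then append the same buffer.
func writeRetrying(_ buffers: [Int], to input: MockWriterInput) {
    for buffer in buffers {
        while !input.isReadyForMoreMediaData {
            // the real code would sleep briefly here
        }
        input.append(buffer)
    }
}

let buffers = Array(0..<6)

let skipping = MockWriterInput()
writeSkipping(buffers, to: skipping)

let retrying = MockWriterInput()
writeRetrying(buffers, to: retrying)

print(skipping.appended) // [1, 3, 5]
print(retrying.appended) // [0, 1, 2, 3, 4, 5]
```

With the skipping loop, half of the buffers never reach the writer, which would show up as a shorter output file.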

Update (based on comments): I found two problems in the code:

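  1. The output settings are incorrect: the input file is mono (1 channel), but the output settings are configured for 2 channels. It should be: [NSNumber numberWithInt:1], AVNumberOfChannelsKey. Look at the info for the output and input files: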

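  2. The second problem is that you are reversing the order of 643 buffers of 8192 audio samples each, rather than reversing the index of each audio sample. To see each buffer, I changed your debugging from looking at the size of each sample to looking at the size of the buffer, which is 8192. So line 76 is now: size_t sampleSize = CMSampleBufferGetNumSamples(sample);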

The output looks like this:

2015-03-19 22:26:28.171 audioReverse[25012:4901250] Reading [0]: 8192
2015-03-19 22:26:28.172 audioReverse[25012:4901250] Reading [1]: 8192
...
2015-03-19 22:26:28.651 audioReverse[25012:4901250] Reading [640]: 8192
2015-03-19 22:26:28.651 audioReverse[25012:4901250] Reading [641]: 8192
2015-03-19 22:26:28.651 audioReverse[25012:4901250] Reading [642]: 5056


2015-03-19 22:26:28.651 audioReverse[25012:4901250] Writing [0]: 5056
2015-03-19 22:26:28.652 audioReverse[25012:4901250] Writing [1]: 8192
...
2015-03-19 22:26:29.134 audioReverse[25012:4901250] Writing [640]: 8192
2015-03-19 22:26:29.135 audioReverse[25012:4901250] Writing [641]: 8192
2015-03-19 22:26:29.135 audioReverse[25012:4901250] Writing [642]: 8192
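As a cross-check on the log above, the buffer sizes can be added up and converted to a duration. This is a small sketch that assumes a sample rate of 44.1 kHz (the rate used in the question's output settings; the actual rate of the input file is not stated).

```swift
// Numbers taken from the log: buffers [0]...[641] hold 8192 samples, buffer [642] holds 5056.
let fullBufferCount = 642
let fullBufferSize = 8192
let lastBufferSize = 5056

// Assumption: 44.1 kHz sample rate.
let sampleRate = 44_100.0

let totalSamples = fullBufferCount * fullBufferSize + lastBufferSize
let seconds = Double(totalSamples) / sampleRate
let wholeSeconds = Int(seconds)

print(totalSamples)                                     // 5264320
print("\(wholeSeconds / 60):\(wholeSeconds % 60)")      // 1:59
```

Under that assumption the buffers add up to the full 1:59 of the input, so no buffer is missing on the reading side in this run.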

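The log shows that you are reversing the order of the 8192-sample buffers, but within each buffer the audio is still "facing forward". We can see this in this screenshot I took of a correct (sample-by-sample) reversal versus your buffer reversal: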

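I think your current scheme can work if you also reverse the samples within each 8192-sample buffer. I personally would not recommend using NSArray enumerators for signal processing, but it can work if you operate at the sample level.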
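The difference between reversing only the buffer order and reversing at the sample level can be seen on a toy signal. This is a minimal sketch with made-up values: a 10-sample signal split into buffers of 4, standing in for the 8192-sample buffers.

```swift
// Toy signal 0...9, split into buffers of 4 samples (the last buffer is shorter).
let signal: [Int16] = (0..<10).map { Int16($0) }
let bufferSize = 4
let toyBuffers: [[Int16]] = stride(from: 0, to: signal.count, by: bufferSize).map { start in
    Array(signal[start..<min(start + bufferSize, signal.count)])
}

// Buffer order reversed, samples inside each buffer still "facing forward".
let bufferOrderOnly: [Int16] = toyBuffers.reversed().flatMap { $0 }

// Buffer order reversed and each buffer's samples reversed as well.
let sampleLevel: [Int16] = toyBuffers.reversed().flatMap { $0.reversed() }

print(bufferOrderOnly) // [8, 9, 4, 5, 6, 7, 0, 1, 2, 3]
print(sampleLevel)     // [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
```

Only the second result is a true reversal of the signal; the first plays short forward snippets in backward order.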

The method described here is implemented in an Xcode project available at this link (multi-platform SwiftUI app):

ReverseAudio Xcode Project

It is not enough to write the audio samples in reverse order. The sample data itself needs to be reversed.

In Swift, we create an extension of AVAsset.

The samples must be processed as decompressed samples. To that end, create audio reader settings with kAudioFormatLinearPCM:
let kAudioReaderSettings = [
    AVFormatIDKey: Int(kAudioFormatLinearPCM) as AnyObject,
    AVLinearPCMBitDepthKey: 16 as AnyObject,
    AVLinearPCMIsBigEndianKey: false as AnyObject,
    AVLinearPCMIsFloatKey: false as AnyObject,
    AVLinearPCMIsNonInterleaved: false as AnyObject]

Use our AVAsset extension method audioReader:

func audioReader(outputSettings: [String : Any]?) -> (audioTrack:AVAssetTrack?, audioReader:AVAssetReader?, audioReaderOutput:AVAssetReaderTrackOutput?) {
    
    if let audioTrack = self.tracks(withMediaType: .audio).first {
        if let audioReader = try? AVAssetReader(asset: self)  {
            let audioReaderOutput = AVAssetReaderTrackOutput(track: audioTrack, outputSettings: outputSettings)
            return (audioTrack, audioReader, audioReaderOutput)
        }
    }
    
    return (nil, nil, nil)
}

let (_, audioReader, audioReaderOutput) = self.audioReader(outputSettings: kAudioReaderSettings)

This creates the audioReader (AVAssetReader) and audioReaderOutput (AVAssetReaderTrackOutput) for reading the audio samples.

We need to keep track of the audio samples:

var audioSamples:[CMSampleBuffer] = []

Now start reading the samples.

if audioReader.startReading() {
    while audioReader.status == .reading {
        if let sampleBuffer = audioReaderOutput.copyNextSampleBuffer(){ 
           // process sample                                       
        }
    }
}

Save the audio sample buffer; we need it later when creating the reversed samples:

audioSamples.append(sampleBuffer)

We need an AVAssetWriter:

guard let assetWriter = try? AVAssetWriter(outputURL: destinationURL, fileType: AVFileType.wav) else {
    // error handling
    return
}

The file type is 'wav' because the reversed samples will be written in the uncompressed audio format Linear PCM, as follows.

For the assetWriter we specify audio compression settings and a "source format hint", which can be obtained from an uncompressed sample buffer:

let sampleBuffer = audioSamples[0]
let sourceFormat = CMSampleBufferGetFormatDescription(sampleBuffer)

let audioCompressionSettings = [AVFormatIDKey: kAudioFormatLinearPCM] as [String : Any]

Now we can create the AVAssetWriterInput, add it to the writer and start writing:

let assetWriterInput = AVAssetWriterInput(mediaType: AVMediaType.audio, outputSettings:audioCompressionSettings, sourceFormatHint: sourceFormat)

assetWriter.add(assetWriterInput)

assetWriter.startWriting()
assetWriter.startSession(atSourceTime: CMTime.zero)

Now iterate over the sample buffers in reverse order, and for each one reverse the samples it contains.

We have an extension of CMSampleBuffer that does just that, named 'reverse'.

Using requestMediaDataWhenReady we do this as follows:

let nbrSamples = audioSamples.count
var index = 0

let serialQueue: DispatchQueue = DispatchQueue(label: "com.limit-point.reverse-audio-queue")
    
assetWriterInput.requestMediaDataWhenReady(on: serialQueue) {
        
    while assetWriterInput.isReadyForMoreMediaData, index < nbrSamples {
        let sampleBuffer = audioSamples[nbrSamples - 1 - index]
            
        if let reversedBuffer = sampleBuffer.reverse(), assetWriterInput.append(reversedBuffer) == true {
            index += 1
        }
        else {
            index = nbrSamples
        }
            
        if index == nbrSamples {
            assetWriterInput.markAsFinished()
            
            finishWriting() // call assetWriter.finishWriting, check assetWriter status, etc.
        }
    }
}
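The key point of the loop above is that index lives outside the callback, so writing resumes where it stopped each time the block is invoked again. The following is a minimal sketch of that control flow, assuming a hypothetical MockInput that accepts at most three buffers per callback; it is not AVAssetWriterInput, and plain integers stand in for sample buffers.

```swift
// Hypothetical stand-in for AVAssetWriterInput: accepts at most `capacity` buffers per callback.
final class MockInput {
    let capacity: Int
    private var acceptedThisRound = 0
    private(set) var written: [Int] = []
    private(set) var finished = false

    init(capacity: Int) {
        self.capacity = capacity
    }

    var isReadyForMoreMediaData: Bool { acceptedThisRound < capacity }

    func append(_ value: Int) -> Bool {
        written.append(value)
        acceptedThisRound += 1
        return true
    }

    func markAsFinished() {
        finished = true
    }

    // Invokes the block repeatedly until the input is marked as finished.
    func requestMediaDataWhenReady(_ block: () -> Void) {
        while !finished {
            acceptedThisRound = 0
            block()
        }
    }
}

let audioSamples = [0, 1, 2, 3, 4, 5, 6]
let input = MockInput(capacity: 3)
var index = 0          // declared outside the callback so progress is kept between invocations
var callbackCount = 0

input.requestMediaDataWhenReady {
    callbackCount += 1
    while input.isReadyForMoreMediaData, index < audioSamples.count {
        let sample = audioSamples[audioSamples.count - 1 - index]
        if input.append(sample) {
            index += 1
        } else {
            index = audioSamples.count
        }
    }
    if index == audioSamples.count {
        input.markAsFinished()
    }
}

print(input.written)  // [6, 5, 4, 3, 2, 1, 0]
print(callbackCount)  // 3
```

In this sketch the finished check sits after the inner loop rather than inside it, which also covers the case of an empty sample list; that placement is a choice of the sketch, not something taken from the project.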

So the last thing to explain is how the audio samples are reversed in the 'reverse' method.

We create an extension of CMSampleBuffer that takes a sample buffer and returns the reversed sample buffer:

func reverse() -> CMSampleBuffer? 

The data that has to be reversed needs to be obtained using the method:

CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer

The CMSampleBuffer header file describes this method as follows:

"Creates an AudioBufferList containing the data from the CMSampleBuffer, and a CMBlockBuffer which references (and manages the lifetime of) the data in that AudioBufferList."

Call it as follows, where 'self' refers to the CMSampleBuffer we are reversing, since this is an extension:

var blockBuffer: CMBlockBuffer? = nil
let audioBufferList: UnsafeMutableAudioBufferListPointer = AudioBufferList.allocate(maximumBuffers: 1)

CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(
    self,
    bufferListSizeNeededOut: nil,
    bufferListOut: audioBufferList.unsafeMutablePointer,
    bufferListSize: AudioBufferList.sizeInBytes(maximumBuffers: 1),
    blockBufferAllocator: nil,
    blockBufferMemoryAllocator: nil,
    flags: kCMSampleBufferFlag_AudioBufferList_Assure16ByteAlignment,
    blockBufferOut: &blockBuffer
 )

Now you can access the raw data:

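// mData is optional, so unwrap it and give up if the buffer list holds no data
guard let data: UnsafeMutableRawPointer = audioBufferList.unsafePointer.pointee.mBuffers.mData else { return nil }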

To reverse the data we need to access it as an array of 'samples' called sampleArray, which is done in Swift as follows:

let samples = data.assumingMemoryBound(to: Int16.self)
        
let sizeofInt16 = MemoryLayout<Int16>.size
let dataSize = audioBufferList.unsafePointer.pointee.mBuffers.mDataByteSize  

let dataCount = Int(dataSize) / sizeofInt16
        
var sampleArray = Array(UnsafeBufferPointer(start: samples, count: dataCount)) as [Int16]

Now reverse the array sampleArray:

sampleArray.reverse()
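One caveat worth illustrating: for interleaved multi-channel audio, reversing the flat Int16 array also swaps the channels inside every frame. For mono input this makes no difference. The sketch below uses a hypothetical helper, reverseFrames, that reverses whole frames instead; it is not part of the project described here.

```swift
// Hypothetical helper: reverses interleaved PCM frame by frame, keeping the channel order per frame.
func reverseFrames(_ samples: [Int16], channels: Int) -> [Int16] {
    precondition(channels > 0 && samples.count % channels == 0, "sample count must be a multiple of the channel count")
    var reversed: [Int16] = []
    reversed.reserveCapacity(samples.count)
    var frameStart = samples.count - channels
    while frameStart >= 0 {
        reversed.append(contentsOf: samples[frameStart..<frameStart + channels])
        frameStart -= channels
    }
    return reversed
}

// Interleaved stereo: L0 R0 L1 R1 L2 R2 (left positive, right negative for readability).
let stereo: [Int16] = [10, -10, 20, -20, 30, -30]

let frameWise = reverseFrames(stereo, channels: 2)
let flat = Array(stereo.reversed())

print(frameWise) // [30, -30, 20, -20, 10, -10]
print(flat)      // [-30, 30, -20, 20, -10, 10]
```

With the flat reversal the left and right channels trade places; the frame-wise version keeps them where they were.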

Using the reversed samples, we need to create a new CMSampleBuffer that contains them.

Now we replace the data in the CMBlockBuffer we previously obtained with CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer:

First reassign 'samples' using the reversed array:

var status:OSStatus = noErr
        
sampleArray.withUnsafeBytes { sampleArrayPtr in
    if let baseAddress = sampleArrayPtr.baseAddress {
        let bufferPointer: UnsafePointer<Int16> = baseAddress.assumingMemoryBound(to: Int16.self)
        let rawPtr = UnsafeRawPointer(bufferPointer)
                
        status = CMBlockBufferReplaceDataBytes(with: rawPtr, blockBuffer: blockBuffer!, offsetIntoDestination: 0, dataLength: Int(dataSize))
    } 
}

if status != noErr {
    return nil
}
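As an alternative to copying the samples into an array and writing them back, the 16-bit samples could in principle be reversed in place through a raw pointer. This is only a sketch of the pointer handling: reverseInt16SamplesInPlace is a hypothetical helper, it is exercised here on an ordinary Swift array, and whether in-place mutation is appropriate for the memory behind a given CMBlockBuffer is an assumption that would need checking.

```swift
// Hypothetical helper: reverses `byteCount` bytes of 16-bit samples in place.
func reverseInt16SamplesInPlace(_ data: UnsafeMutableRawPointer, byteCount: Int) {
    let count = byteCount / MemoryLayout<Int16>.size
    var samples = UnsafeMutableBufferPointer(
        start: data.assumingMemoryBound(to: Int16.self),
        count: count
    )
    samples.reverse()
}

// Exercise the helper on a plain array instead of an AudioBufferList.
var pcm: [Int16] = [1, 2, 3, 4, 5]
pcm.withUnsafeMutableBytes { bytes in
    if let base = bytes.baseAddress {
        reverseInt16SamplesInPlace(base, byteCount: bytes.count)
    }
}

print(pcm) // [5, 4, 3, 2, 1]
```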

Finally, create the new sample buffer using CMSampleBufferCreate. This function needs two arguments that we can get from the original sample buffer, namely formatDescription and numberOfSamples:

let formatDescription = CMSampleBufferGetFormatDescription(self)   
let numberOfSamples = CMSampleBufferGetNumSamples(self)
        
var newBuffer:CMSampleBuffer?
        

Now create the new sample buffer with the reversed blockBuffer:

guard CMSampleBufferCreate(allocator: kCFAllocatorDefault, dataBuffer: blockBuffer, dataReady: true, makeDataReadyCallback: nil, refcon: nil, formatDescription: formatDescription, sampleCount: numberOfSamples, sampleTimingEntryCount: 0, sampleTimingArray: nil, sampleSizeEntryCount: 0, sampleSizeArray: nil, sampleBufferOut: &newBuffer) == noErr else {
    return self
}
        
return newBuffer

And that's all there is to it!

As a final note, the Core Audio and AVFoundation headers provide a lot of useful information, such as CoreAudioTypes.h, CMSampleBuffer.h, and many more.

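A complete example that reverses video and audio into the same asset output using Swift 5, with the audio processed using the suggestions above: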

 private func reverseVideo(inURL: URL, outURL: URL, queue: DispatchQueue, _ completionBlock: ((Bool)->Void)?) {
    Log.info("Start reverse video!")
    let asset = AVAsset.init(url: inURL)
    guard
        let reader = try? AVAssetReader.init(asset: asset),
        let videoTrack = asset.tracks(withMediaType: .video).first,
        let audioTrack = asset.tracks(withMediaType: .audio).first

        else {
            assert(false)
            completionBlock?(false)
            return
    }

    let width = videoTrack.naturalSize.width
    let height = videoTrack.naturalSize.height

    // Video reader
    let readerVideoSettings: [String : Any] = [ String(kCVPixelBufferPixelFormatTypeKey) : kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange,]
    let readerVideoOutput = AVAssetReaderTrackOutput.init(track: videoTrack, outputSettings: readerVideoSettings)
    reader.add(readerVideoOutput)

    // Audio reader
    let readerAudioSettings: [String : Any] = [
        AVFormatIDKey: kAudioFormatLinearPCM,
        AVLinearPCMBitDepthKey: 16 ,
        AVLinearPCMIsBigEndianKey: false ,
        AVLinearPCMIsFloatKey: false,]
    let readerAudioOutput = AVAssetReaderTrackOutput.init(track: audioTrack, outputSettings: readerAudioSettings)
    reader.add(readerAudioOutput)

    //Start reading content
    reader.startReading()

    //Reading video samples
    var videoBuffers = [CMSampleBuffer]()
    while let nextBuffer = readerVideoOutput.copyNextSampleBuffer() {
        videoBuffers.append(nextBuffer)
    }

    //Reading audio samples
    var audioBuffers = [CMSampleBuffer]()
    var timingInfos = [CMSampleTimingInfo]()
    while let nextBuffer = readerAudioOutput.copyNextSampleBuffer() {

        var timingInfo = CMSampleTimingInfo()
        var timingInfoCount = CMItemCount()
        CMSampleBufferGetSampleTimingInfoArray(nextBuffer, entryCount: 0, arrayToFill: &timingInfo, entriesNeededOut: &timingInfoCount)

        let duration = CMSampleBufferGetDuration(nextBuffer)
        let endTime = CMTimeAdd(timingInfo.presentationTimeStamp, duration)
        let newPresentationTime = CMTimeSubtract(duration, endTime)

        timingInfo.presentationTimeStamp = newPresentationTime

        timingInfos.append(timingInfo)
        audioBuffers.append(nextBuffer)
    }

    //Stop reading
    let status = reader.status
    reader.cancelReading()
    guard status == .completed, let firstVideoBuffer = videoBuffers.first, let firstAudioBuffer = audioBuffers.first else {
        assert(false)
        completionBlock?(false)
        return
    }

    //Start video time
    let sessionStartTime = CMSampleBufferGetPresentationTimeStamp(firstVideoBuffer)

    //Writer for video
    let writerVideoSettings: [String:Any] = [
        AVVideoCodecKey : AVVideoCodecType.h264,
        AVVideoWidthKey : width,
        AVVideoHeightKey: height,
    ]
    let writerVideoInput: AVAssetWriterInput
    if let formatDescription = videoTrack.formatDescriptions.last {
        writerVideoInput = AVAssetWriterInput.init(mediaType: .video, outputSettings: writerVideoSettings, sourceFormatHint: (formatDescription as! CMFormatDescription))
    } else {
        writerVideoInput = AVAssetWriterInput.init(mediaType: .video, outputSettings: writerVideoSettings)
    }
    writerVideoInput.transform = videoTrack.preferredTransform
    writerVideoInput.expectsMediaDataInRealTime = false

    //Writer for audio
    let writerAudioSettings: [String:Any] = [
        AVFormatIDKey : kAudioFormatMPEG4AAC,
        AVSampleRateKey : 44100,
        AVNumberOfChannelsKey: 2,
        AVEncoderBitRateKey:128000,
        AVChannelLayoutKey: NSData(),
    ]
    let sourceFormat = CMSampleBufferGetFormatDescription(firstAudioBuffer)
    let writerAudioInput: AVAssetWriterInput = AVAssetWriterInput.init(mediaType: .audio, outputSettings: writerAudioSettings, sourceFormatHint: sourceFormat)
    writerAudioInput.expectsMediaDataInRealTime = true

    guard
        let writer = try? AVAssetWriter.init(url: outURL, fileType: .mp4),
        writer.canAdd(writerVideoInput),
        writer.canAdd(writerAudioInput)
        else {
            assert(false)
            completionBlock?(false)
            return
    }

    let pixelBufferAdaptor = AVAssetWriterInputPixelBufferAdaptor.init(assetWriterInput: writerVideoInput, sourcePixelBufferAttributes: nil)
    let group = DispatchGroup.init()

    group.enter()
    writer.add(writerVideoInput)
    writer.add(writerAudioInput)
    writer.startWriting()
    writer.startSession(atSourceTime: sessionStartTime)

    var videoFinished = false
    var audioFinished = false

    //Write video samples in reverse order
    var currentSample = 0
    writerVideoInput.requestMediaDataWhenReady(on: queue) {
        for i in currentSample..<videoBuffers.count {
            currentSample = i
            if !writerVideoInput.isReadyForMoreMediaData {
                return
            }
            let presentationTime = CMSampleBufferGetPresentationTimeStamp(videoBuffers[i])
            guard let imageBuffer = CMSampleBufferGetImageBuffer(videoBuffers[videoBuffers.count - i - 1]) else {
                Log.info("VideoWriter reverseVideo: warning, could not get imageBuffer from SampleBuffer...")
                continue
            }
            if !pixelBufferAdaptor.append(imageBuffer, withPresentationTime: presentationTime) {
                Log.info("VideoWriter reverseVideo: warning, could not append imageBuffer...")
            }
        }

        // finish write video samples
        writerVideoInput.markAsFinished()
        Log.info("Video writing finished!")
        videoFinished = true
        if(audioFinished){
            group.leave()
        }
    }
    //Write audio samples in reverse order
    let totalAudioSamples = audioBuffers.count
    // kept outside the callback so writing resumes where it stopped on the next invocation
    var currentAudioSample = 0
    writerAudioInput.requestMediaDataWhenReady(on: queue) {
        // visit every buffer, including the last one
        while currentAudioSample < totalAudioSamples {
            if !writerAudioInput.isReadyForMoreMediaData {
                return
            }
            let i = currentAudioSample
            currentAudioSample += 1
            let audioSample = audioBuffers[totalAudioSamples-1-i]
            let timingInfo = timingInfos[i]
            // reverse samples data using timing info
            if let reversedBuffer = audioSample.reverse(timingInfo: [timingInfo]) {
                // append data
                if writerAudioInput.append(reversedBuffer) == false {
                    break
                }
            }
        }

        // finish
        writerAudioInput.markAsFinished()
        Log.info("Audio writing finished!")
        audioFinished = true
        if(videoFinished){
            group.leave()
        }
    }

    group.notify(queue: queue) {
        writer.finishWriting {
            if writer.status != .completed {
                Log.info("VideoWriter reverse video: error - \(String(describing: writer.error))")
                completionBlock?(false)
            } else {
                Log.info("Ended reverse video!")
                completionBlock?(true)
            }
        }
    }
}
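The listing above derives new timestamps from each buffer's own timing info. Independently of that listing, the general idea of mirroring a buffer's position on the timeline can be sketched with plain integers. The buffer lengths below are made up, and positions are counted in samples rather than CMTime; this is an illustration of the arithmetic, not the method used in the listing.

```swift
// Buffer lengths in samples, in original (forward) order; the values are illustrative.
let bufferLengths = [4, 4, 2]
let totalLength = bufferLengths.reduce(0, +)

// Forward start offset of each buffer: 0, 4, 8.
var forwardStarts: [Int] = []
var cursor = 0
for length in bufferLengths {
    forwardStarts.append(cursor)
    cursor += length
}

// Mirrored start: a buffer that ended at (start + length) now begins that far from the end.
let mirroredStarts = zip(forwardStarts, bufferLengths).map { pair in
    totalLength - (pair.0 + pair.1)
}

// Buffers are written last-to-first, so their mirrored starts come out in ascending order.
let writeOrderStarts = Array(mirroredStarts.reversed())

print(mirroredStarts)   // [6, 2, 0]
print(writeOrderStarts) // [0, 2, 6]
```

Written in reverse order, the short last buffer lands at position 0 and the remaining buffers follow without gaps or overlaps.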

Happy coding!

extension CMSampleBuffer {

func reverse(timingInfo:[CMSampleTimingInfo]) -> CMSampleBuffer? {
    var blockBuffer: CMBlockBuffer? = nil
    let audioBufferList: UnsafeMutableAudioBufferListPointer = AudioBufferList.allocate(maximumBuffers: 1)

    CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(
        self,
        bufferListSizeNeededOut: nil,
        bufferListOut: audioBufferList.unsafeMutablePointer,
        bufferListSize: AudioBufferList.sizeInBytes(maximumBuffers: 1),
        blockBufferAllocator: nil,
        blockBufferMemoryAllocator: nil,
        flags: kCMSampleBufferFlag_AudioBufferList_Assure16ByteAlignment,
        blockBufferOut: &blockBuffer
     )
    
    if let data = audioBufferList.unsafePointer.pointee.mBuffers.mData {
    
        let samples = data.assumingMemoryBound(to: Int16.self)

        let sizeofInt16 = MemoryLayout<Int16>.size
        let dataSize = audioBufferList.unsafePointer.pointee.mBuffers.mDataByteSize

        let dataCount = Int(dataSize) / sizeofInt16

        var sampleArray = Array(UnsafeBufferPointer(start: samples, count: dataCount)) as [Int16]
        
        sampleArray.reverse()
        
        var status:OSStatus = noErr
                
        sampleArray.withUnsafeBytes { sampleArrayPtr in
            if let baseAddress = sampleArrayPtr.baseAddress {
                let bufferPointer: UnsafePointer<Int16> = baseAddress.assumingMemoryBound(to: Int16.self)
                let rawPtr = UnsafeRawPointer(bufferPointer)
                        
                status = CMBlockBufferReplaceDataBytes(with: rawPtr, blockBuffer: blockBuffer!, offsetIntoDestination: 0, dataLength: Int(dataSize))
            }
        }

        if status != noErr {
            return nil
        }
        
        let formatDescription = CMSampleBufferGetFormatDescription(self)
        let numberOfSamples = CMSampleBufferGetNumSamples(self)

        var newBuffer:CMSampleBuffer?
        
        guard CMSampleBufferCreate(allocator: kCFAllocatorDefault, dataBuffer: blockBuffer, dataReady: true, makeDataReadyCallback: nil, refcon: nil, formatDescription: formatDescription, sampleCount: numberOfSamples, sampleTimingEntryCount: timingInfo.count, sampleTimingArray: timingInfo, sampleSizeEntryCount: 0, sampleSizeArray: nil, sampleBufferOut: &newBuffer) == noErr else {
            return self
        }

        return newBuffer
    }
    return nil
}
}

The function is missing!