Why do I get popping noises from my Core Audio program?

I am trying to figure out how to use Apple's Core Audio APIs to record and play back linear PCM audio without any file I/O. (The recording side seems to work just fine.)

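The code I have is pretty short, and it works somewhat. However, I can't identify the source of the clicks and pops in the output. I've been beating my head against this for days with no success.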

I have posted a git repo here, with a command-line program that shows where I'm at: https://github.com/maxharris9/AudioRecorderPlayerSwift/tree/main/AudioRecorderPlayerSwift

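I put in a couple of functions to prepopulate the recording. The tone generator (makeWave) and the noise generator (makeNoise) are only in here as debugging aids. Ultimately, I'm trying to identify the source of the messed-up output when you play back a recording in audioData:
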
// makeWave(duration: 30.0, frequency: 441.0) // appends to `audioData`
// makeNoise(frameCount: Int(44100.0 * 30)) // appends to `audioData`
_ = Recorder() // appends to `audioData`
_ = Player() // reads from `audioData`

The Player code is as follows:

var lastIndexRead: Int = 0

func outputCallback(inUserData: UnsafeMutableRawPointer?, inAQ: AudioQueueRef, inBuffer: AudioQueueBufferRef) {
    guard let player = inUserData?.assumingMemoryBound(to: Player.PlayingState.self) else {
        print("missing user data in output callback")
        return
    }

    let sliceStart = lastIndexRead
    let sliceEnd = min(audioData.count, lastIndexRead + bufferByteSize - 1)
    print("slice start:", sliceStart, "slice end:", sliceEnd, "audioData.count", audioData.count)

    if sliceEnd >= audioData.count {
        player.pointee.running = false
        print("found end of audio data")
        return
    }

    let slice = Array(audioData[sliceStart ..< sliceEnd])
    let sliceCount = slice.count

    // doesn't fix it
    // audioData[sliceStart ..< sliceEnd].withUnsafeBytes {
    //         inBuffer.pointee.mAudioData.copyMemory(from: $0.baseAddress!, byteCount: Int(sliceCount))
    // }

    memcpy(inBuffer.pointee.mAudioData, slice, sliceCount)
    inBuffer.pointee.mAudioDataByteSize = UInt32(sliceCount)
    lastIndexRead += sliceCount + 1

    // enqueue the buffer, or re-enqueue it if it's a used one
    check(AudioQueueEnqueueBuffer(inAQ, inBuffer, 0, nil))
}

struct Player {
    struct PlayingState {
        var packetPosition: UInt32 = 0
        var running: Bool = false
        var start: Int = 0
        var end: Int = Int(bufferByteSize)
    }

    init() {
        var playingState: PlayingState = PlayingState()
        var queue: AudioQueueRef?

        // this doesn't help
        // check(AudioQueueNewOutput(&audioFormat, outputCallback, &playingState, CFRunLoopGetMain(), CFRunLoopMode.commonModes.rawValue, 0, &queue))

        check(AudioQueueNewOutput(&audioFormat, outputCallback, &playingState, nil, nil, 0, &queue))

        var buffers: [AudioQueueBufferRef?] = Array<AudioQueueBufferRef?>.init(repeating: nil, count: BUFFER_COUNT)

        print("Playing\n")
        playingState.running = true

        for i in 0 ..< BUFFER_COUNT {
            check(AudioQueueAllocateBuffer(queue!, UInt32(bufferByteSize), &buffers[i]))
            outputCallback(inUserData: &playingState, inAQ: queue!, inBuffer: buffers[i]!)

            if !playingState.running {
                break
            }
        }

        check(AudioQueueStart(queue!, nil))

        repeat {
            CFRunLoopRunInMode(CFRunLoopMode.defaultMode, BUFFER_DURATION, false)
        } while playingState.running

        // delay to ensure queue emits all buffered audio
        CFRunLoopRunInMode(CFRunLoopMode.defaultMode, BUFFER_DURATION * Double(BUFFER_COUNT + 1), false)

        check(AudioQueueStop(queue!, true))
        check(AudioQueueDispose(queue!, true))
    }
}

I captured the audio with Audio Hijack and noticed that the jumps are indeed correlated with the size of the buffer.

Why is this happening, and what can I do to fix it?

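I believe you were beginning to zero in on, or at least suspect, the cause of the popping you are hearing: it's caused by discontinuities in your waveform.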
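
To make the idea of a discontinuity concrete, here is a minimal sketch that measures the largest sample-to-sample jump in a buffer. The helper name largestJump, the half-amplitude test tone, and the simulated 30-sample gap are all assumptions for illustration and are not taken from your repo.

```swift
import Foundation

/// Returns the largest absolute difference between neighbouring samples.
/// (Hypothetical debugging helper, not part of the original program.)
func largestJump(in samples: [Int16]) -> Int {
    var largest = 0
    for (previous, current) in zip(samples, samples.dropFirst()) {
        // Widen to Int so the subtraction cannot overflow Int16.
        largest = max(largest, abs(Int(current) - Int(previous)))
    }
    return largest
}

// Assumed test signal: mono 441 Hz sine at 44.1 kHz, half amplitude.
let sampleRate = 44100.0
let tone: [Int16] = (0..<4410).map { n in
    Int16(16000.0 * sin(2.0 * Double.pi * 441.0 * Double(n) / sampleRate))
}

// Simulate a playback bug that silently skips 30 samples at index 1000.
let glitched = Array(tone[0..<1000]) + Array(tone[1030...])

print("clean tone, largest jump:", largestJump(in: tone))
print("glitched tone, largest jump:", largestJump(in: glitched))
```

A smooth tone only ever moves a small step between neighbouring samples, so a jump far larger than that step marks a spot where samples were dropped or repeated, which is what the ear hears as a pop.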

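My initial hunch was that you were generating the buffers independently (i.e. assuming that each one starts at time=0), but I checked out your code and that wasn't the case. I then suspected that some of the calculations in makeWave were at fault. To check this theory, I replaced your makeWave with the following: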

func makeWave(offset: Double, numSamples: Int, sampleRate: Float64, frequency: Float64, numChannels: Int) -> [Int16] {
    var data = [Int16]()
    for sample in 0..<numSamples / numChannels {
        // time in s
        let t = offset + Double(sample) / sampleRate
        let value = Double(Int16.max) * sin(2 * Double.pi * frequency * t)
        for _ in 0..<numChannels {
            data.append(Int16(value))
        }
    }
    return data
}

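This function removes the double loop in the original, accepts an offset so it knows which part of the wave is being generated, and makes some changes to how the sine wave is sampled.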
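
As a rough check of the offset idea, the sketch below generates the tone in four chunks and compares the result with generating it in one call. The chunk size, channel count, and the copy of makeWave included for self-containment are assumptions for this example only.

```swift
import Foundation

// Copy of the makeWave replacement, repeated here so the sketch is self-contained.
func makeWave(offset: Double, numSamples: Int, sampleRate: Float64, frequency: Float64, numChannels: Int) -> [Int16] {
    var data = [Int16]()
    for sample in 0..<numSamples / numChannels {
        let t = offset + Double(sample) / sampleRate // time in seconds
        let value = Double(Int16.max) * sin(2 * Double.pi * frequency * t)
        for _ in 0..<numChannels {
            data.append(Int16(value))
        }
    }
    return data
}

let sampleRate = 44100.0
let channels = 2
let chunkSamples = 2048                        // assumed chunk size, in Int16 values
let framesPerChunk = chunkSamples / channels   // frames (time steps) per chunk

// Generate four chunks, advancing the offset by the time each chunk covers.
var chunked = [Int16]()
for chunk in 0..<4 {
    let offset = Double(chunk * framesPerChunk) / sampleRate
    chunked += makeWave(offset: offset, numSamples: chunkSamples, sampleRate: sampleRate, frequency: 441.0, numChannels: channels)
}

// Generate the same span in a single call for comparison.
let whole = makeWave(offset: 0, numSamples: 4 * chunkSamples, sampleRate: sampleRate, frequency: 441.0, numChannels: channels)

// Floating-point rounding may shift a sample by one step, so compare with a tolerance.
let worstMismatch = zip(chunked, whole).map { pair in abs(Int(pair.0) - Int(pair.1)) }.max() ?? 0
print("worst mismatch between chunked and whole:", worstMismatch)
```

Note that the offset advances by frames, not by Int16 values: with two channels, each chunk of 2048 values covers only 1024 time steps.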

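When Player is modified to use this function, you get a lovely steady tone. I'll add the changes to the player soon. I can't in good conscience show the quick and dirty mess it is now to the public.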

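Based on your comments below, I refocused on your player. The problem is that the audio buffers expect byte counts, but the slice count and some of the other calculations are based on Int16 counts. The following version of outputCallback will fix it. Concentrate on the use of the new variable bytesPerChannel.
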
func outputCallback(inUserData: UnsafeMutableRawPointer?, inAQ: AudioQueueRef, inBuffer: AudioQueueBufferRef) {
    guard let player = inUserData?.assumingMemoryBound(to: Player.PlayingState.self) else {
        print("missing user data in output callback")
        return
    }

    let bytesPerChannel = MemoryLayout<Int16>.size
    let sliceStart = lastIndexRead
    let sliceEnd = min(audioData.count, lastIndexRead + bufferByteSize/bytesPerChannel)

    if sliceEnd >= audioData.count {
        player.pointee.running = false
        print("found end of audio data")
        return
    }

    let slice = Array(audioData[sliceStart ..< sliceEnd])
    let sliceCount = slice.count

    print("slice start:", sliceStart, "slice end:", sliceEnd, "audioData.count", audioData.count, "slice count:", sliceCount)

    // need to be careful to convert from counts of Ints to bytes
    memcpy(inBuffer.pointee.mAudioData, slice, sliceCount*bytesPerChannel)
    inBuffer.pointee.mAudioDataByteSize = UInt32(sliceCount*bytesPerChannel)
    lastIndexRead += sliceCount

    // enqueue the buffer, or re-enqueue it if it's a used one
    check(AudioQueueEnqueueBuffer(inAQ, inBuffer, 0, nil))
}
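
The same bookkeeping can be tried out without an audio queue. In this minimal sketch the fill helper, the 16-byte buffer, and the 20-sample test array are hypothetical stand-ins for outputCallback, inBuffer.pointee.mAudioData, and audioData. The read position advances in samples, while the copy size is expressed in bytes.

```swift
import Foundation

/// Copies up to one buffer's worth of samples, starting at `start`, into `destination`.
/// Returns how many samples were consumed and how many bytes were written.
/// (Hypothetical helper used only for this illustration.)
func fill(_ destination: UnsafeMutableRawPointer, capacityInBytes: Int,
          from source: [Int16], startingAt start: Int) -> (sampleCount: Int, byteCount: Int) {
    let bytesPerSample = MemoryLayout<Int16>.size
    let end = min(source.count, start + capacityInBytes / bytesPerSample)
    let sampleCount = max(0, end - start)
    let byteCount = sampleCount * bytesPerSample   // convert samples to bytes exactly once
    guard byteCount > 0 else { return (0, 0) }
    source.withUnsafeBytes { raw in
        let from = raw.baseAddress!.advanced(by: start * bytesPerSample)
        destination.copyMemory(from: from, byteCount: byteCount)
    }
    return (sampleCount, byteCount)
}

let audio: [Int16] = (0..<20).map { Int16($0) }    // stand-in for audioData
let capacity = 16                                  // stand-in for bufferByteSize
let buffer = UnsafeMutableRawPointer.allocate(byteCount: capacity,
                                              alignment: MemoryLayout<Int16>.alignment)

var position = 0            // index into `audio`, counted in samples
var played = [Int16]()      // everything that "reached the speaker"
while position < audio.count {
    let result = fill(buffer, capacityInBytes: capacity, from: audio, startingAt: position)
    for i in 0..<result.sampleCount {
        played.append(buffer.load(fromByteOffset: i * MemoryLayout<Int16>.size, as: Int16.self))
    }
    position += result.sampleCount   // advance by samples, never by bytes
}
buffer.deallocate()

print("played matches source:", played == audio)
```

If the byte count and the sample count are mixed up, as in the original callback, each pass copies only part of the range it then skips over, so the played data has a gap at every buffer boundary.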

I didn't look at the Recorder code, but you may want to check whether the same sort of error crept in there.