Low latency audio output problems on iOS (aka How to beat AUAudioUnit sampleRate, maximumFramesToRender, and ioBufferDuration into submission)
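Okay, I'm clearly missing some important piece here. I'm trying to do low-latency audio across a network, and my fundamental frames are 10 ms. I expected this to be no problem. My target phone is an iPhone X using its speakers, so my hardware sample rate should be locked to 48000 Hz. I'm requesting 10 ms, which is a nice even number and should come to 480, 960, 1920, or 3840 depending on how you want to slice frames/samples/bytes.
Yet, for the life of me, I absolutely cannot get iOS to do anything I regard as sane. I get a 10.667 ms buffer duration, which is ludicrous: iOS is going out of its way to give me buffer sizes that aren't integer multiples of the sample rate. Even worse, the frame is slightly LONG, which means that I have to absorb not one but two packets of latency in order to fill that buffer. I can't get maximumFramesToRender to change at all, and the system is returning 0 as my sample rate even though it is very plainly rendering at 48000 Hz.
I'm clearly missing something important. What is it? Did I forget to disconnect/connect something in order to get a direct hardware mapping? (My format is 1, which is pcmFormatFloat32. I would expect pcmFormatInt16 or pcmFormatInt32 for mapping directly to hardware, so something in the OS is probably getting in the way.) Pointers are appreciated, and I'm happy to go read more. Or is AUAudioUnit simply half-baked, and I need to go back to older, more useful APIs? Or did I completely miss the plot, and low-latency audio folks use a whole different set of audio management functions?
Thanks for the help. It's much appreciated.
Output from code: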
2019-11-07 23:28:29.782786-0800 latencytest[3770:50382] Ready to receive user events
2019-11-07 23:28:34.727478-0800 latencytest[3770:50382] Start button pressed
2019-11-07 23:28:34.727745-0800 latencytest[3770:50382] Launching auxiliary thread
2019-11-07 23:28:34.729278-0800 latencytest[3770:50445] Thread main started
2019-11-07 23:28:35.006005-0800 latencytest[3770:50445] Sample rate: 0
2019-11-07 23:28:35.016935-0800 latencytest[3770:50445] Buffer duration: 0.010667
2019-11-07 23:28:35.016970-0800 latencytest[3770:50445] Number of output busses: 2
2019-11-07 23:28:35.016989-0800 latencytest[3770:50445] Max frames: 4096
2019-11-07 23:28:35.017010-0800 latencytest[3770:50445] Can perform output: 1
2019-11-07 23:28:35.017023-0800 latencytest[3770:50445] Output Enabled: 1
2019-11-07 23:28:35.017743-0800 latencytest[3770:50445] Bus channels: 2
2019-11-07 23:28:35.017864-0800 latencytest[3770:50445] Bus format: 1
2019-11-07 23:28:35.017962-0800 latencytest[3770:50445] Bus rate: 0
2019-11-07 23:28:35.018039-0800 latencytest[3770:50445] Sleeping 0
2019-11-07 23:28:35.018056-0800 latencytest[3770:50445] Buffer count: 2 4096
2019-11-07 23:28:36.023220-0800 latencytest[3770:50445] Sleeping 1
2019-11-07 23:28:36.023400-0800 latencytest[3770:50445] Buffer count: 190 389120
2019-11-07 23:28:37.028610-0800 latencytest[3770:50445] Sleeping 2
2019-11-07 23:28:37.028790-0800 latencytest[3770:50445] Buffer count: 378 774144
2019-11-07 23:28:38.033983-0800 latencytest[3770:50445] Sleeping 3
2019-11-07 23:28:38.034142-0800 latencytest[3770:50445] Buffer count: 566 1159168
2019-11-07 23:28:39.039333-0800 latencytest[3770:50445] Sleeping 4
2019-11-07 23:28:39.039534-0800 latencytest[3770:50445] Buffer count: 756 1548288
2019-11-07 23:28:40.041787-0800 latencytest[3770:50445] Sleeping 5
2019-11-07 23:28:40.041943-0800 latencytest[3770:50445] Buffer count: 944 1933312
2019-11-07 23:28:41.042878-0800 latencytest[3770:50445] Sleeping 6
2019-11-07 23:28:41.043037-0800 latencytest[3770:50445] Buffer count: 1132 2318336
2019-11-07 23:28:42.048219-0800 latencytest[3770:50445] Sleeping 7
2019-11-07 23:28:42.048375-0800 latencytest[3770:50445] Buffer count: 1320 2703360
2019-11-07 23:28:43.053613-0800 latencytest[3770:50445] Sleeping 8
2019-11-07 23:28:43.053771-0800 latencytest[3770:50445] Buffer count: 1508 3088384
2019-11-07 23:28:44.058961-0800 latencytest[3770:50445] Sleeping 9
2019-11-07 23:28:44.059119-0800 latencytest[3770:50445] Buffer count: 1696 3473408
Actual code:
import UIKit
import os.log
import Foundation
import AudioToolbox
import AVFoundation
class AuxiliaryWork: Thread {
    let II_SAMPLE_RATE = 48000
    var iiStopRequested: Int32 = 0; // Int32 is normally guaranteed to be atomic on most architectures
    var iiBufferFillCount: Int32 = 0;
    var iiBufferByteCount: Int32 = 0;

    func requestStop() {
        iiStopRequested = 1;
    }

    func myAVAudioSessionInterruptionNotificationHandler(notification: Notification ) -> Void {
        os_log(OSLogType.info, "AVAudioSession Interrupted: %s", notification.debugDescription)
    }

    func myAudioUnitProvider(actionFlags: UnsafeMutablePointer<AudioUnitRenderActionFlags>, timestamp: UnsafePointer<AudioTimeStamp>,
                             frameCount: AUAudioFrameCount, inputBusNumber: Int, inputData: UnsafeMutablePointer<AudioBufferList>) -> AUAudioUnitStatus {
        let ppInputData = UnsafeMutableAudioBufferListPointer(inputData)
        let iiNumBuffers = ppInputData.count
        if (iiNumBuffers > 0) {
            assert(iiNumBuffers == 2)
            for bbBuffer in ppInputData {
                assert(Int(bbBuffer.mDataByteSize) == 2048) // FIXME: This should be 960 or 1920 ...
                iiBufferFillCount += 1
                iiBufferByteCount += Int32(bbBuffer.mDataByteSize)
                memset(bbBuffer.mData, 0, Int(bbBuffer.mDataByteSize)) // Just send silence
            }
        } else {
            os_log(OSLogType.error, "Zero buffers from system")
            assert(iiNumBuffers != 0) // Force crash since os_log would cause an audio hiccup due to locks anyway
        }
        return noErr
    }

    override func main() {
        os_log(OSLogType.info, "Thread main started")
        #if os(iOS)
        let kOutputUnitSubType = kAudioUnitSubType_RemoteIO
        #else
        let kOutputUnitSubType = kAudioUnitSubtype_HALOutput
        #endif
        let audioSession = AVAudioSession.sharedInstance() // FIXME: Causes the following message No Factory registered for id
        try! audioSession.setCategory(AVAudioSession.Category.playback, options: [])
        try! audioSession.setMode(AVAudioSession.Mode.measurement)
        try! audioSession.setPreferredSampleRate(48000.0)
        try! audioSession.setPreferredIOBufferDuration(0.010)
        NotificationCenter.default.addObserver(
            forName: AVAudioSession.interruptionNotification,
            object: nil,
            queue: nil,
            using: myAVAudioSessionInterruptionNotificationHandler
        )
        let ioUnitDesc = AudioComponentDescription(
            componentType: kAudioUnitType_Output,
            componentSubType: kOutputUnitSubType,
            componentManufacturer: kAudioUnitManufacturer_Apple,
            componentFlags: 0,
            componentFlagsMask: 0)
        let auUnit = try! AUAudioUnit(componentDescription: ioUnitDesc,
                                      options: AudioComponentInstantiationOptions())
        auUnit.outputProvider = myAudioUnitProvider;
        auUnit.maximumFramesToRender = 256
        try! audioSession.setActive(true)
        try! auUnit.allocateRenderResources() // Make sure audio unit has hardware resources--we could provide the buffers from the circular buffer if we want
        try! auUnit.startHardware()
        os_log(OSLogType.info, "Sample rate: %d", audioSession.sampleRate);
        os_log(OSLogType.info, "Buffer duration: %f", audioSession.ioBufferDuration);
        os_log(OSLogType.info, "Number of output busses: %d", auUnit.outputBusses.count);
        os_log(OSLogType.info, "Max frames: %d", auUnit.maximumFramesToRender);
        os_log(OSLogType.info, "Can perform output: %d", auUnit.canPerformOutput)
        os_log(OSLogType.info, "Output Enabled: %d", auUnit.isOutputEnabled)
        //os_log(OSLogType.info, "Audio Format: %p", audioFormat)
        var bus0 = auUnit.outputBusses[0]
        os_log(OSLogType.info, "Bus channels: %d", bus0.format.channelCount)
        os_log(OSLogType.info, "Bus format: %d", bus0.format.commonFormat.rawValue)
        os_log(OSLogType.info, "Bus rate: %d", bus0.format.sampleRate)
        for ii in 0..<10 {
            if (iiStopRequested != 0) {
                os_log(OSLogType.info, "Manual stop requested");
                break;
            }
            os_log(OSLogType.info, "Sleeping %d", ii);
            os_log(OSLogType.info, "Buffer count: %d %d", iiBufferFillCount, iiBufferByteCount)
            Thread.sleep(forTimeInterval: 1.0);
        }
        auUnit.stopHardware()
    }
}
class FirstViewController: UIViewController {
    var thrAuxiliaryWork: AuxiliaryWork? = nil;

    override func viewDidLoad() {
        super.viewDidLoad()
        // Do any additional setup after loading the view.
    }

    @IBAction func startButtonPressed(_ sender: Any) {
        os_log(OSLogType.error, "Start button pressed");
        os_log(OSLogType.error, "Launching auxiliary thread");
        thrAuxiliaryWork = AuxiliaryWork();
        thrAuxiliaryWork?.start();
    }

    @IBAction func stopButtonPressed(_ sender: Any) {
        os_log(OSLogType.error, "Stop button pressed");
        os_log(OSLogType.error, "Manually stopping auxiliary thread");
        thrAuxiliaryWork?.requestStop();
    }

    @IBAction func muteButtonPressed(_ sender: Any) {
        os_log(OSLogType.error, "Mute button pressed");
    }

    @IBAction func unmuteButtonPressed(_ sender: Any) {
        os_log(OSLogType.error, "Unmute button pressed");
    }
}
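You can't beat iOS silicon hardware into submission by assuming the API will do it for you. If you want to abstract away the hardware, you have to do your own buffering.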
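As a rough illustration of "doing your own buffering", here is a minimal sketch of a FIFO that accepts 480-frame network packets on one side and hands out whatever frame count the render callback asks for on the other. The type `SampleFIFO` and its methods are hypothetical names invented for this example, and the sketch is deliberately single-threaded: a real render callback would need a lock-free single-producer/single-consumer variant, which is not shown here.

```swift
import Foundation

/// Hypothetical fixed-capacity FIFO for mono Float32 samples.
/// Not thread-safe: a real audio render callback needs a lock-free variant.
struct SampleFIFO {
    private var storage: [Float]
    private var readIndex = 0
    private var count = 0

    init(capacity: Int) {
        storage = [Float](repeating: 0, count: capacity)
    }

    /// Number of samples currently waiting to be read.
    var availableFrames: Int { count }

    /// Appends samples; returns how many were accepted before the FIFO filled up.
    mutating func write(_ samples: [Float]) -> Int {
        let n = min(samples.count, storage.count - count)
        var writeIndex = (readIndex + count) % storage.count
        for i in 0..<n {
            storage[writeIndex] = samples[i]
            writeIndex = (writeIndex + 1) % storage.count
        }
        count += n
        return n
    }

    /// Fills `out` completely, padding with silence on underrun.
    /// Returns the number of real (non-padding) samples delivered.
    mutating func read(into out: inout [Float]) -> Int {
        let n = min(out.count, count)
        for i in 0..<n {
            out[i] = storage[readIndex]
            readIndex = (readIndex + 1) % storage.count
        }
        for i in n..<out.count { out[i] = 0 }
        count -= n
        return n
    }
}

// Two 10 ms packets (480 frames each at 48 kHz) feed one 512-frame callback.
var fifo = SampleFIFO(capacity: 4096)
let packet = [Float](repeating: 0.25, count: 480)
var hardwareBuffer = [Float](repeating: 0, count: 512)
_ = fifo.write(packet)
_ = fifo.write(packet)
let delivered = fifo.read(into: &hardwareBuffer)
print(delivered, fifo.availableFrames) // prints "512 448"
```

The 448 frames left over stay queued for the next callback, which is why the packet size and the callback size no longer need to match.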
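For the very best (lowest) latency, your software has to adapt, potentially dynamically, to the actual hardware capabilities, which can vary from device to device and from mode to mode.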
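The numbers in the question's log are consistent with this. The sketch below works through the arithmetic under one assumption that the source does not confirm: that the system rounds the requested 10 ms (480 frames at 48 kHz) up to the next power of two. The helper names `bufferDuration` and `nextPowerOfTwo` are made up for the example. Separately, the `Sample rate: 0` line is likely a logging artifact rather than a real value, since `sampleRate` is a `Double` but is logged with `%d`.

```swift
import Foundation

/// Duration in seconds of `frames` sample frames at `sampleRate` Hz.
func bufferDuration(frames: Int, sampleRate: Double) -> Double {
    return Double(frames) / sampleRate
}

/// Smallest power of two that is >= frames (assumes frames >= 1).
func nextPowerOfTwo(atLeast frames: Int) -> Int {
    var size = 1
    while size < frames { size *= 2 }
    return size
}

let sampleRate = 48_000.0
let requestedFrames = Int((0.010 * sampleRate).rounded())      // 10 ms request
let grantedFrames = nextPowerOfTwo(atLeast: requestedFrames)   // assumed rounding rule
let grantedDuration = bufferDuration(frames: grantedFrames, sampleRate: sampleRate)
let bytesPerBuffer = grantedFrames * MemoryLayout<Float>.size  // Float32, one channel per buffer

print(requestedFrames, grantedFrames, String(format: "%.6f", grantedDuration), bytesPerBuffer)
// prints "480 512 0.010667 2048"
```

512 frames matches both the reported `Buffer duration: 0.010667` and the 2048-byte buffers asserted in the callback. It also gives 48000 / 512 = 93.75 callbacks per second, which with two buffers per callback is close to the roughly 188 buffer fills per second in the log.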
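The hardware sample rate appears to be either 44.1 ksps (older iOS devices), 48 ksps (newer arm64 iOS devices), or an integer multiple thereof (and potentially other rates when you plug in non-AirPod Bluetooth headsets or external ADCs). The actual hardware DMA (or equivalent) buffers seem to always be a power of 2 in size, potentially as small as 64 samples on the newest devices. However, various iOS power-saving modes will increase the buffer size (by powers of 2) up to 4k samples, especially on older iOS devices. If you request a sample rate other than the hardware rate, the OS may resample the buffers to a size that is not a power of 2, and that size can change from one Audio Unit callback to the next if the resampling ratio isn't an exact integer.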
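To see why a non-integer resampling ratio makes the callback size wobble, consider a hypothetical case: the hardware runs 512-frame buffers at 48 kHz while the app asks for 44.1 kHz. Each hardware buffer then corresponds to 470.4 app-side frames, which cannot be delivered as a constant integer. The sketch below is only a model that carries the fractional remainder forward; the function `callbackSizes` is an invented name, and the schedule iOS actually uses is not known from the source.

```swift
/// Integer frame counts the app would be asked for per hardware buffer,
/// carrying the fractional remainder forward. Integer arithmetic only.
func callbackSizes(hardwareFrames: Int, hardwareRate: Int, appRate: Int, callbacks: Int) -> [Int] {
    var sizes: [Int] = []
    var remainder = 0
    for _ in 0..<callbacks {
        let scaled = hardwareFrames * appRate + remainder
        sizes.append(scaled / hardwareRate)
        remainder = scaled % hardwareRate
    }
    return sizes
}

// Mismatched rates: the size alternates between 470 and 471 frames.
let mismatched = callbackSizes(hardwareFrames: 512, hardwareRate: 48_000, appRate: 44_100, callbacks: 5)
print(mismatched) // prints "[470, 470, 471, 470, 471]"

// Matching rates: every callback is exactly the hardware size.
let matched = callbackSizes(hardwareFrames: 512, hardwareRate: 48_000, appRate: 48_000, callbacks: 3)
print(matched) // prints "[512, 512, 512]"
```

Code that asserts a fixed byte count per callback, as the question's provider does, would only hold in the matched case.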
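Audio Units are the lowest level accessible via public API on iOS devices. Everything else is built on top of them and thus potentially incurs greater latency. For instance, if you use the Audio Queue API with non-hardware buffer sizes, the OS will internally use power-of-2 audio buffers to access the hardware, and chop them up or fractionally concatenate them to return or fetch Audio Queue buffers of non-hardware sizes. That is slower and more jittery.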
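For a long time, the iOS API was the only API usable on mobile phones and tablets for live, low-latency music performance, but only by developing software matched to the hardware.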