Streaming audio recorded with AudioQueue over a WebSocket in iOS

I am building a transcription app for iOS, so I have to record audio into buffers and stream it to a server over a socket. I am using AudioQueue to record the audio into buffers.

The audio is recorded correctly to a local file. For streaming, I convert the audio data to NSData and send it over the socket. However, the audio quality on the server is poor; the speech in particular is not clear at all, and there is a lot of noise where the voice should be. The same logic works fine on Android, so the server-side code is correct; the problem is in the iOS streaming conversion. I have tried two different socket libraries (SocketRocket and PocketSocket), and the problem is the same with both.

I have attached my code below. Please let me know if you can help.

ViewController.h

#import <UIKit/UIKit.h>
#import <AudioToolbox/AudioQueue.h>
#import <AudioToolbox/AudioFile.h>
#import <SocketRocket/SocketRocket.h>

#define NUM_BUFFERS 3
#define SAMPLERATE 16000

//Struct defining recording state
typedef struct {
    AudioStreamBasicDescription dataFormat;
    AudioQueueRef               queue;
    AudioQueueBufferRef         buffers[NUM_BUFFERS];
    AudioFileID                 audioFile;
    SInt64                      currentPacket;
    bool                        recording;
} RecordState;


//Struct defining playback state
typedef struct {
    AudioStreamBasicDescription dataFormat;
    AudioQueueRef               queue;
    AudioQueueBufferRef         buffers[NUM_BUFFERS];
    AudioFileID                 audioFile;
    SInt64                      currentPacket;
    bool                        playing;
} PlayState;

@interface ViewController : UIViewController <SRWebSocketDelegate> {
    RecordState recordState;
    PlayState playState;
    CFURLRef fileURL;
}

@property (nonatomic, strong) SRWebSocket * webSocket;

@property (weak, nonatomic) IBOutlet UITextView *textView;

@end

ViewController.m

#import "ViewController.h"


id thisClass;

//Declare C callback functions
void AudioInputCallback(void * inUserData,  // Custom audio metadata
                        AudioQueueRef inAQ,
                        AudioQueueBufferRef inBuffer,
                        const AudioTimeStamp * inStartTime,
                        UInt32 inNumberPacketDescriptions,
                        const AudioStreamPacketDescription * inPacketDescs);

void AudioOutputCallback(void * inUserData,
                         AudioQueueRef outAQ,
                         AudioQueueBufferRef outBuffer);


@interface ViewController ()



@end

@implementation ViewController 

@synthesize webSocket;
@synthesize textView;


// Takes a filled buffer and writes it to disk, "emptying" the buffer
void AudioInputCallback(void * inUserData,
                        AudioQueueRef inAQ, 
                        AudioQueueBufferRef inBuffer,
                        const AudioTimeStamp * inStartTime,
                        UInt32 inNumberPacketDescriptions,
                        const AudioStreamPacketDescription * inPacketDescs)
{
    RecordState * recordState = (RecordState*)inUserData;
    if (!recordState->recording)
    {
        printf("Not recording, returning\n");
        return; // without this early return, stale buffers were still written and sent
    }


    printf("Writing buffer %lld\n", recordState->currentPacket);
    OSStatus status = AudioFileWritePackets(recordState->audioFile,
                                            false,
                                            inBuffer->mAudioDataByteSize,
                                            inPacketDescs,
                                            recordState->currentPacket,
                                            &inNumberPacketDescriptions,
                                            inBuffer->mAudioData);



    if (status == 0)
    {
        recordState->currentPacket += inNumberPacketDescriptions;

        // Send only the bytes actually captured in this buffer; multiplying
        // by NUM_BUFFERS read past the end of the buffer and sent garbage.
        NSData * audioData = [NSData dataWithBytes:inBuffer->mAudioData length:inBuffer->mAudioDataByteSize];
        [thisClass sendAudioToSocketAsData:audioData];

    }

    AudioQueueEnqueueBuffer(recordState->queue, inBuffer, 0, NULL);
}

// Fills an empty buffer with data and sends it to the speaker
void AudioOutputCallback(void * inUserData,
                         AudioQueueRef outAQ,
                         AudioQueueBufferRef outBuffer) {
    PlayState * playState = (PlayState *) inUserData;
    if(!playState -> playing) {
        printf("Not playing, returning\n");
        return;
    }

    printf("Queuing buffer %lld for playback\n", playState -> currentPacket);

    // Linear PCM is constant bitrate, so no packet descriptions are
    // needed; pass NULL instead of an uninitialized pointer.
    AudioStreamPacketDescription * packetDescs = NULL;

    UInt32 bytesRead;
    // The buffer was allocated with SAMPLERATE bytes and each packet is
    // 2 bytes, so read at most SAMPLERATE / 2 packets per buffer.
    UInt32 numPackets = SAMPLERATE / 2;
    OSStatus status;
    status = AudioFileReadPackets(playState -> audioFile, false, &bytesRead, packetDescs, playState -> currentPacket, &numPackets, outBuffer -> mAudioData);

    if (numPackets) {
        outBuffer -> mAudioDataByteSize = bytesRead;
        status = AudioQueueEnqueueBuffer(playState -> queue, outBuffer, 0, packetDescs);
        playState -> currentPacket += numPackets;
    }else {
        if (playState -> playing) {
            AudioQueueStop(playState -> queue, false);
            AudioFileClose(playState -> audioFile);
            playState -> playing = false;
        }

        AudioQueueFreeBuffer(playState -> queue, outBuffer);
    }

}

- (void) setupAudioFormat:(AudioStreamBasicDescription *) format {


    format -> mSampleRate = SAMPLERATE;
    format -> mFormatID = kAudioFormatLinearPCM;
    format -> mFramesPerPacket = 1;
    format -> mChannelsPerFrame = 1;
    format -> mBytesPerFrame = 2;
    format -> mBytesPerPacket = 2;
    format -> mBitsPerChannel = 16;
    format -> mReserved = 0;
    format -> mFormatFlags =  kLinearPCMFormatFlagIsBigEndian |kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;

}


- (void)viewDidLoad {
    [super viewDidLoad];
    // Do any additional setup after loading the view, typically from a nib.

    char path[256];
    [self getFilename:path maxLength:sizeof path];

    fileURL = CFURLCreateFromFileSystemRepresentation(NULL, (UInt8*)path, strlen(path), false);


    // Init state variables
    recordState.recording = false;
    thisClass = self;

}

- (void) startRecordingInQueue {
    [self setupAudioFormat:&recordState.dataFormat];

    recordState.currentPacket = 0;

    OSStatus status;

    status = AudioQueueNewInput(&recordState.dataFormat, AudioInputCallback, &recordState, CFRunLoopGetCurrent(), kCFRunLoopCommonModes, 0, &recordState.queue);
    if(status == 0) {
        //Prime recording buffers with empty data
        for (int i=0; i < NUM_BUFFERS; i++) {
            AudioQueueAllocateBuffer(recordState.queue, SAMPLERATE, &recordState.buffers[i]);
            AudioQueueEnqueueBuffer(recordState.queue, recordState.buffers[i], 0, NULL);
        }

        status = AudioFileCreateWithURL(fileURL, kAudioFileAIFFType, &recordState.dataFormat, kAudioFileFlags_EraseFile, &recordState.audioFile);
        if (status == 0) {
            recordState.recording = true;
            status = AudioQueueStart(recordState.queue, NULL);
            if(status == 0) {
                NSLog(@"-----------Recording--------------");
                NSLog(@"File URL : %@", fileURL);
            }
        }
    }

    if (status != 0) {
        [self stopRecordingInQueue];
    }
}

- (void) stopRecordingInQueue {
    recordState.recording = false;
    AudioQueueStop(recordState.queue, true);
    for (int i=0; i < NUM_BUFFERS; i++) {
        AudioQueueFreeBuffer(recordState.queue, recordState.buffers[i]);
    }

    AudioQueueDispose(recordState.queue, true);
    AudioFileClose(recordState.audioFile);
    NSLog(@"---Idle------");
    NSLog(@"File URL : %@", fileURL);


}

- (void) startPlaybackInQueue {
    playState.currentPacket = 0;
    [self setupAudioFormat:&playState.dataFormat];

    OSStatus status;
    status = AudioFileOpenURL(fileURL, kAudioFileReadPermission, kAudioFileAIFFType, &playState.audioFile);
    if (status == 0) {
        status = AudioQueueNewOutput(&playState.dataFormat, AudioOutputCallback, &playState, CFRunLoopGetCurrent(), kCFRunLoopCommonModes, 0, &playState.queue);
        if( status == 0) {
            //Allocate and prime playback buffers
            playState.playing = true;
            for (int i=0; i < NUM_BUFFERS && playState.playing; i++) {
                AudioQueueAllocateBuffer(playState.queue, SAMPLERATE, &playState.buffers[i]);
                AudioOutputCallback(&playState, playState.queue, playState.buffers[i]);
            }

            status = AudioQueueStart(playState.queue, NULL);
            if (status == 0) {
                NSLog(@"-------Playing Audio---------");
            }
        }
    }

    if (status != 0) {
        [self stopPlaybackInQueue];
        NSLog(@"---Playing Audio Failed ------");
    }
}

- (void) stopPlaybackInQueue {
    playState.playing = false;

    for (int i=0; i < NUM_BUFFERS; i++) {
        AudioQueueFreeBuffer(playState.queue, playState.buffers[i]);
    }

    AudioQueueDispose(playState.queue, true);
    AudioFileClose(playState.audioFile);
}

- (IBAction)startRecordingAudio:(id)sender {
    NSLog(@"starting recording tapped");
    [self startRecordingInQueue];
}
- (IBAction)stopRecordingAudio:(id)sender {
    NSLog(@"stop recording tapped");
    [self stopRecordingInQueue];
}


- (IBAction)startPlayingAudio:(id)sender {
    NSLog(@"start playing audio tapped");
    [self startPlaybackInQueue];
}

- (IBAction)stopPlayingAudio:(id)sender {
    NSLog(@"stop playing audio tapped");
    [self stopPlaybackInQueue];
}

- (BOOL) getFilename:(char *) buffer maxLength:(int) maxBufferLength {

    NSArray * paths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
    NSString * docDir = [paths objectAtIndex:0];

    // Use stringByAppendingPathComponent so a "/" separates the directory
    // from the file name; stringByAppendingString produced ".../Documentsrecording.aif".
    NSString * file = [docDir stringByAppendingPathComponent:@"recording.aif"];
    return [file getCString:buffer maxLength:maxBufferLength encoding:NSUTF8StringEncoding];

}


- (void) sendAudioToSocketAsData:(NSData *) audioData {
    [self.webSocket send:audioData];
}

- (IBAction)connectToSocketTapped:(id)sender {
    [self startStreaming];
}

- (void) startStreaming {
    [self connectToSocket];
}

- (void) connectToSocket {
    //Socket Connection Intiliazation

    // create the NSURLRequest that will be sent as the handshake
    NSURLRequest *request = [NSURLRequest requestWithURL:[NSURL URLWithString:@"${url}"]];

    // create the socket and assign delegate

    self.webSocket = [[SRWebSocket alloc] initWithURLRequest:request];

    self.webSocket.delegate = self;

    // open socket
    [self.webSocket open];

}


///--------------------------------------
#pragma mark - SRWebSocketDelegate
///--------------------------------------

- (void)webSocketDidOpen:(SRWebSocket *)webSocket
{
    NSLog(@"Websocket Connected");

}

- (void) webSocket:(SRWebSocket *)webSocket didFailWithError:(NSError *)error {
    NSLog(@":( Websocket Failed With Error %@", error);
    self.webSocket = nil;
}

- (void) webSocket:(SRWebSocket *)webSocket didReceiveMessage:(id)message {
    NSLog(@"Received \"%@\"", message);

    textView.text = message;    
}

- (void)webSocket:(SRWebSocket *)webSocket didCloseWithCode:(NSInteger)code reason:(NSString *)reason wasClean:(BOOL)wasClean
{
    NSLog(@"WebSocket closed");
    self.webSocket = nil;
}

- (void)webSocket:(SRWebSocket *)webSocket didReceivePong:(NSData *)pongPayload
{
    NSLog(@"WebSocket received pong");
}

- (void)didReceiveMemoryWarning {
    [super didReceiveMemoryWarning];
    // Dispose of any resources that can be recreated.
}

Thanks in advance.

I got it working. It was the audio format setup that was causing the problem. I configured the audio correctly after checking the server-side documentation. Big-endian was the issue: if you specify kLinearPCMFormatFlagIsBigEndian, the samples are big-endian; if you don't, they are little-endian. I needed little-endian.

- (void) setupAudioFormat:(AudioStreamBasicDescription *) format {

    format -> mSampleRate = 16000.0;
    format -> mFormatID = kAudioFormatLinearPCM;
    format -> mFramesPerPacket = 1;
    format -> mChannelsPerFrame = 1;
    format -> mBytesPerFrame = 2;
    format -> mBytesPerPacket = 2;
    format -> mBitsPerChannel = 16;
    // kLinearPCMFormatFlagIsBigEndian removed: without it the samples
    // are little-endian, which is what the server expects
    format -> mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
}
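
For reference, if a server ever required the opposite byte order instead, you could keep the capture format as-is and byte-swap each 16-bit sample just before sending. This is only a minimal sketch under that assumption; byteSwappedDataFromBuffer: is a hypothetical helper, not part of my fix:

#import <CoreFoundation/CFByteOrder.h>

// Hypothetical helper: reverses the byte order of every 16-bit PCM
// sample in a filled queue buffer before it is streamed. Only needed
// when the AudioQueue format and the server disagree on endianness.
- (NSData *) byteSwappedDataFromBuffer:(AudioQueueBufferRef) buffer {
    NSMutableData * data = [NSMutableData dataWithBytes:buffer->mAudioData
                                                 length:buffer->mAudioDataByteSize];
    SInt16 * samples = (SInt16 *) data.mutableBytes;
    NSUInteger count = data.length / sizeof(SInt16);
    for (NSUInteger i = 0; i < count; i++) {
        samples[i] = (SInt16) CFSwapInt16((UInt16) samples[i]); // swap the two bytes
    }
    return data;
}

Dropping the big-endian flag, as above, is the simpler fix, since it lets the hardware deliver little-endian samples directly and avoids the extra pass over every buffer.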