How can I send audio to Microsoft Translator using a WebSocket?
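I have created an application that translates text to text and speech to text.
I have finished text-to-text and text-to-speech, but I have not been able to get speech-to-text working.
I am using this demo, https://github.com/bitmapdata/MSTranslateVendor, but it only handles text-to-text and text-to-speech.
I searched Stack Overflow and found suggestions to send the audio over a WebSocket, but I don't know how to send it, and I am new to WebSocket programming.
Please help me understand how to send audio using a WebSocket.
I am creating the audio with the code below, but I don't know how to send it.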
- (void)viewDidLoad {
    [super viewDidLoad];
    settings = [[NSMutableDictionary alloc] init];
    [settings setValue:[NSNumber numberWithInt:kAudioFormatLinearPCM] forKey:AVFormatIDKey];
    [settings setValue:[NSNumber numberWithFloat:44100.0] forKey:AVSampleRateKey];
    [settings setValue:[NSNumber numberWithInt: 2] forKey:AVNumberOfChannelsKey];
    [settings setValue:[NSNumber numberWithInt: 16] forKey:AVLinearPCMBitDepthKey];
    [settings setValue:[NSNumber numberWithBool: NO] forKey:AVLinearPCMIsBigEndianKey];
    [settings setValue:[NSNumber numberWithBool: NO] forKey:AVLinearPCMIsFloatKey];
    [settings setValue:[NSNumber numberWithInt: AVAudioQualityHigh] forKey:AVEncoderAudioQualityKey];
    NSArray *pathComponents = [NSArray arrayWithObjects:
                               [NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES) lastObject],
                               @"Sohil.wav",
                               nil];
    outputFileURL = [NSURL fileURLWithPathComponents:pathComponents];
    NSLog(@"Record URL : %@",outputFileURL);
    // Setup audio session
    AVAudioSession *session = [AVAudioSession sharedInstance];
    [session setCategory:AVAudioSessionCategoryPlayAndRecord error:nil];
    // Initiate and prepare the recorder
    recorder = [[AVAudioRecorder alloc] initWithURL:outputFileURL settings:settings error:nil];
    recorder.delegate = self;
    recorder.meteringEnabled = YES;
    [recorder prepareToRecord];
}
- (IBAction)recordStart:(id)sender {
    AVAudioSession *session = [AVAudioSession sharedInstance];
    [session setActive:YES error:nil];
    [recorder record];
}
- (IBAction)recordStop:(id)sender {
    [recorder stop];
    AVAudioSession *audioSession = [AVAudioSession sharedInstance];
    [audioSession setActive:NO error:nil];
}
And I convert it with:
// Drops the first 44 bytes of the recorded file and prepends a freshly built header.
- (NSData *)stripAndAddWavHeader:(NSData *)wav {
    if ([wav length] < 44) {
        return nil; // too short to contain a 44-byte WAV header
    }
    unsigned long wavDataSize = [wav length] - 44;
    NSData *WaveFile = [wav subdataWithRange:NSMakeRange(44, wavDataSize)];
    return [self addWavHeader:WaveFile];
}
- (NSMutableData *)addWavHeader:(NSData *)wavNoheader {
    int headerSize = 44;
    long totalAudioLen = [wavNoheader length];
    long totalDataLen = [wavNoheader length] + headerSize - 8;
    // The header must describe the samples that follow it, so these values
    // mirror the recorder settings (44100 Hz, 2 channels, 16-bit PCM).
    long longSampleRate = 44100;
    int channels = 2;
    int bitsPerSample = 16;
    long byteRate = longSampleRate * channels * bitsPerSample / 8;
    Byte header[44]; // stack buffer, so nothing has to be freed
    header[0] = 'R'; // RIFF/WAVE header
    header[1] = 'I';
    header[2] = 'F';
    header[3] = 'F';
    header[4] = (Byte) (totalDataLen & 0xff);
    header[5] = (Byte) ((totalDataLen >> 8) & 0xff);
    header[6] = (Byte) ((totalDataLen >> 16) & 0xff);
    header[7] = (Byte) ((totalDataLen >> 24) & 0xff);
    header[8] = 'W';
    header[9] = 'A';
    header[10] = 'V';
    header[11] = 'E';
    header[12] = 'f'; // 'fmt ' chunk
    header[13] = 'm';
    header[14] = 't';
    header[15] = ' ';
    header[16] = 16; // 4 bytes: size of 'fmt ' chunk
    header[17] = 0;
    header[18] = 0;
    header[19] = 0;
    header[20] = 1; // format = 1 (PCM)
    header[21] = 0;
    header[22] = (Byte) channels;
    header[23] = 0;
    header[24] = (Byte) (longSampleRate & 0xff);
    header[25] = (Byte) ((longSampleRate >> 8) & 0xff);
    header[26] = (Byte) ((longSampleRate >> 16) & 0xff);
    header[27] = (Byte) ((longSampleRate >> 24) & 0xff);
    header[28] = (Byte) (byteRate & 0xff);
    header[29] = (Byte) ((byteRate >> 8) & 0xff);
    header[30] = (Byte) ((byteRate >> 16) & 0xff);
    header[31] = (Byte) ((byteRate >> 24) & 0xff);
    header[32] = (Byte) (channels * bitsPerSample / 8); // block align
    header[33] = 0;
    header[34] = (Byte) bitsPerSample; // bits per sample
    header[35] = 0;
    header[36] = 'd';
    header[37] = 'a';
    header[38] = 't';
    header[39] = 'a';
    header[40] = (Byte) (totalAudioLen & 0xff);
    header[41] = (Byte) ((totalAudioLen >> 8) & 0xff);
    header[42] = (Byte) ((totalAudioLen >> 16) & 0xff);
    header[43] = (Byte) ((totalAudioLen >> 24) & 0xff);
    NSMutableData *newWavData = [NSMutableData dataWithBytes:header length:44];
    [newWavData appendBytes:[wavNoheader bytes] length:[wavNoheader length]];
    return newWavData;
}
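For reference, the fields of a 44-byte PCM WAV header depend on one another: block align is channels × bits per sample / 8, and byte rate is sample rate × block align. The following is a minimal sketch in plain C (which also compiles inside an Objective-C file) that derives those fields from three inputs instead of hard-coding them. The function name `wav_header_fill` and its parameters are hypothetical, chosen only for illustration, and the values passed in must match whatever format the audio was actually recorded in.

```c
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value in little-endian byte order. */
static void put_le32(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Store a 16-bit value in little-endian byte order. */
static void put_le16(uint8_t *p, uint16_t v) {
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
}

/* Fill a 44-byte PCM WAV header (hypothetical helper, not part of any SDK).
   pcm_bytes is the length of the raw sample data that will follow the header. */
void wav_header_fill(uint8_t header[44], uint32_t sample_rate,
                     uint16_t channels, uint16_t bits_per_sample,
                     uint32_t pcm_bytes) {
    uint16_t block_align = (uint16_t)(channels * bits_per_sample / 8);
    uint32_t byte_rate = sample_rate * block_align;

    memcpy(header, "RIFF", 4);
    put_le32(header + 4, 36 + pcm_bytes);   /* file size minus the first 8 bytes */
    memcpy(header + 8, "WAVE", 4);
    memcpy(header + 12, "fmt ", 4);
    put_le32(header + 16, 16);              /* size of the 'fmt ' chunk */
    put_le16(header + 20, 1);               /* format 1 = linear PCM */
    put_le16(header + 22, channels);
    put_le32(header + 24, sample_rate);
    put_le32(header + 28, byte_rate);
    put_le16(header + 32, block_align);
    put_le16(header + 34, bits_per_sample);
    memcpy(header + 36, "data", 4);
    put_le32(header + 40, pcm_bytes);       /* size of the sample data */
}
```

Calling `wav_header_fill(header, 44100, 2, 16, (uint32_t)[pcm length])` from Objective-C would describe audio recorded with the settings shown in `viewDidLoad`; a different recording format would need different arguments.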
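You can use Microsoft Cognitive-Speech-STT-iOS; it works very well for speech-to-text.
1) First, register your app at Register App.
2) Next, get a subscription key for Bing Speech - Preview. Put this key in the demo project's setting.plist file and it works. You are given two keys; you can use either one.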
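If you would rather send the audio over a WebSocket yourself, as the question asks, one common pattern is to split the audio bytes into small binary frames and send them in order. The sketch below shows only that chunking step in plain C. The names `send_frame_fn`, `send_in_chunks`, and `count_bytes` are hypothetical, and the callback stands in for whatever WebSocket client you use (on iOS 13 and later it could, for example, wrap `NSURLSessionWebSocketTask`'s `sendMessage:completionHandler:`). The endpoint, authentication, and audio format the service expects are not covered here and need to be taken from the service documentation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: sends one binary WebSocket frame.
   Returns 0 on success, non-zero on failure. */
typedef int (*send_frame_fn)(const uint8_t *bytes, size_t len, void *ctx);

/* Send `len` bytes as frames of at most `chunk` bytes each, in order.
   Returns the number of frames sent, or -1 on bad arguments or send failure. */
long send_in_chunks(const uint8_t *data, size_t len, size_t chunk,
                    send_frame_fn send, void *ctx) {
    if (chunk == 0 || send == NULL) {
        return -1;
    }
    long frames = 0;
    for (size_t off = 0; off < len; off += chunk) {
        size_t remaining = len - off;
        size_t n = remaining < chunk ? remaining : chunk;
        if (send(data + off, n, ctx) != 0) {
            return -1; /* stop at the first frame that fails */
        }
        frames++;
    }
    return frames;
}

/* Stand-in callback for trying the helper without a network connection:
   it only adds up the bytes it is given. ctx must point to a size_t. */
int count_bytes(const uint8_t *bytes, size_t len, void *ctx) {
    (void)bytes;
    *(size_t *)ctx += len;
    return 0;
}
```

In a real app the callback would hand each chunk to the WebSocket client instead of counting bytes. Keeping the chunking separate from the transport makes it easy to check the framing logic on its own.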