How to convert a byte[] into a continuous input stream
The project is a WebSocket server built with Netty.
The Netty client sends requests:
File file = new File("D:\\zh-16000-30s.pcm");
FileInputStream fis = new FileInputStream(file);
int length;
int dataSize = 4096;
byte[] bytes = new byte[dataSize];
int status = 0; // 0 = first packet, 1 = continuation, 2 = end of stream
// Simulate an Android or iOS client pushing a stream
while ((length = fis.read(bytes, 0, dataSize)) != -1) {
    JSONObject jsonObject = new JSONObject();
    jsonObject.put("audio", Base64.getEncoder().encodeToString(Arrays.copyOf(bytes, length)));
    jsonObject.put("status", status);
    WebSocketFrame frame = new TextWebSocketFrame(jsonObject.toJSONString());
    ch.writeAndFlush(frame);
    status = 1;
}
fis.close();
// read() returned -1, so the file is finished; send the end-of-stream packet
status = 2;
JSONObject jsonObject = new JSONObject();
jsonObject.put("audio", "");
jsonObject.put("status", status);
WebSocketFrame frame = new TextWebSocketFrame(jsonObject.toJSONString());
ch.writeAndFlush(frame);
The Netty server handler:
// A reusable, typed key; AttributeKey.newInstance("login") would throw on the
// second connection because key names are global, so valueOf is used instead.
private static final AttributeKey<byte[]> AUDIO_KEY = AttributeKey.valueOf("login");

@Override
protected void channelRead0(ChannelHandlerContext ctx, WebSocketFrame frame) throws Exception {
    // ping and pong frames already handled
    if (frame instanceof TextWebSocketFrame) {
        String request = ((TextWebSocketFrame) frame).text();
        JSONObject jsonObject = JSONObject.parseObject(request);
        int status = jsonObject.getIntValue("status");
        byte[] recByte = Base64.getDecoder().decode(jsonObject.getString("audio"));
        if (status == 0) {
            // First packet: start a new buffer for this channel
            ctx.channel().attr(AUDIO_KEY).set(recByte);
        } else if (status == 1) {
            // Continuation packet: append to the accumulated buffer
            byte[] a = ctx.channel().attr(AUDIO_KEY).get();
            byte[] c = new byte[a.length + recByte.length];
            System.arraycopy(a, 0, c, 0, a.length);
            System.arraycopy(recByte, 0, c, a.length, recByte.length);
            ctx.channel().attr(AUDIO_KEY).set(c);
        } else if (status == 2) {
            // The end of the file or stream
            saveAudio(ctx.channel().attr(AUDIO_KEY).get());
        }
        // Send the uppercase string back
        ctx.channel().writeAndFlush(new TextWebSocketFrame(request.toUpperCase(Locale.US)));
    } else {
        String message = "unsupported frame type: " + frame.getClass().getName();
        throw new UnsupportedOperationException(message);
    }
}
I want to use Microsoft streaming speech recognition.
Sample code snippet:
// Creates an instance of a speech config with specified
// subscription key and service region. Replace with your own subscription key
// and service region (e.g., "westus").
SpeechConfig config = SpeechConfig.fromSubscription("YourSubscriptionKey", "YourServiceRegion");
// Create an audio stream from a wav file.
// Replace with your own audio file name.
PullAudioInputStreamCallback callback = new WavStream(new FileInputStream("YourAudioFile.wav"));
AudioConfig audioInput = AudioConfig.fromStreamInput(callback);
Code snippet 2:
private final InputStream stream;

public WavStream(InputStream wavStream) {
    try {
        this.stream = parseWavHeader(wavStream);
    } catch (Exception ex) {
        throw new IllegalArgumentException(ex.getMessage());
    }
}

@Override
public int read(byte[] dataBuffer) {
    long ret = 0;
    try {
        ret = this.stream.read(dataBuffer, 0, dataBuffer.length);
    } catch (Exception ex) {
        System.out.println("Read " + ex);
    }
    // read() returns -1 at end of stream; the Speech SDK expects 0 to mean EOF
    return (int) Math.max(0, ret);
}

@Override
public void close() {
    try {
        this.stream.close();
    } catch (IOException ex) {
        // ignored
    }
}
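The pull callback above expects one continuous InputStream, while the Netty handler receives discrete byte[] packets. One common bridge is a PipedOutputStream/PipedInputStream pair: the handler writes each decoded packet into one end, and the callback reads from the other. A minimal sketch, assuming packets arrive on the Netty event loop and the recognizer reads on its own thread; the ChunkedAudioStream class and its method names are illustrative, not part of any SDK:

import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

/** Bridges discrete byte[] packets into one continuous InputStream. */
public class ChunkedAudioStream {
    private final PipedOutputStream out = new PipedOutputStream();
    private final PipedInputStream in;

    public ChunkedAudioStream() throws IOException {
        // 1 MiB internal buffer; writes block when the reader falls behind
        this.in = new PipedInputStream(out, 1 << 20);
    }

    /** Call once per received packet (e.g. inside channelRead0). */
    public void append(byte[] chunk) throws IOException {
        out.write(chunk);
    }

    /** Call on the status-2 packet; the reader then sees end of stream. */
    public void finish() throws IOException {
        out.close(); // read(...) on the other end now returns -1
    }

    /** Continuous view over everything appended so far and still to come. */
    public InputStream stream() {
        return in;
    }
}

PipedInputStream requires the writer and the reader to run on different threads, which this setup satisfies: the Netty event loop appends while the Speech SDK pulls from its own worker thread.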
Question:
How do I convert byte[] packets into a continuous InputStream?
For example:
I speak for 30 seconds, and assume the Netty server receives one packet per second.
The Netty server sends each 1-second packet to Microsoft speech recognition.
The Microsoft speech service returns intermediate results.
When the Netty client finishes sending, Microsoft finishes recognizing at the same time.
Thanks.
Is your question about the Netty WebSocket server, or about the Speech SDK objects?
For using the Speech SDK in this way, my recommendation is to use a push stream instead of a pull stream. It's generally easier to manage on your side. Pseudocode:
// FOR SETUP... BEFORE you are accepting audio in your websocket server
// (or on first acceptance of the first packet of audio):
// create push stream
// create audio config from push stream
// create speech config
// create speech recognizer from speech config and audio config
// hook up event handlers for intermediate results (recognizing events)
// hook up event handlers for final results (recognized events)
// start recognition (recognize once or start continuous recognition)
// ON EACH AUDIO packet your websocket server accepts:
// push the audio data into the push stream with pushStream.write(...)
// ON EACH recognizing event, send back the result.text to your client
// ON EACH recognized event, send back the result.text to your client
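To make the pseudocode concrete, here is a minimal Java sketch of that push-stream flow, assuming 16 kHz / 16-bit / mono PCM to match the zh-16000-30s.pcm test file; the PushStreamSession class and its method names are illustrative, not from the original post:

import com.microsoft.cognitiveservices.speech.ResultReason;
import com.microsoft.cognitiveservices.speech.SpeechConfig;
import com.microsoft.cognitiveservices.speech.SpeechRecognizer;
import com.microsoft.cognitiveservices.speech.audio.AudioConfig;
import com.microsoft.cognitiveservices.speech.audio.AudioInputStream;
import com.microsoft.cognitiveservices.speech.audio.AudioStreamFormat;
import com.microsoft.cognitiveservices.speech.audio.PushAudioInputStream;

/** One recognition session per websocket connection. */
public class PushStreamSession {
    private final PushAudioInputStream pushStream;
    private final SpeechRecognizer recognizer;

    public PushStreamSession() throws Exception {
        SpeechConfig speechConfig =
                SpeechConfig.fromSubscription("YourSubscriptionKey", "YourServiceRegion");
        // 16 kHz, 16-bit, mono PCM (an assumption based on the file name)
        AudioStreamFormat format =
                AudioStreamFormat.getWaveFormatPCM(16000L, (short) 16, (short) 1);
        pushStream = AudioInputStream.createPushStream(format);
        recognizer = new SpeechRecognizer(speechConfig, AudioConfig.fromStreamInput(pushStream));

        // Intermediate results: forward the text back to the websocket client here
        recognizer.recognizing.addEventListener((s, e) ->
                System.out.println("RECOGNIZING: " + e.getResult().getText()));
        // Final result for each recognized utterance
        recognizer.recognized.addEventListener((s, e) -> {
            if (e.getResult().getReason() == ResultReason.RecognizedSpeech) {
                System.out.println("RECOGNIZED: " + e.getResult().getText());
            }
        });

        recognizer.startContinuousRecognitionAsync().get();
    }

    /** Call for each audio packet (status 0 or 1) with the Base64-decoded bytes. */
    public void onAudioPacket(byte[] pcm) {
        pushStream.write(pcm);
    }

    /** Call on the status-2 packet: no more audio is coming. */
    public void onEndOfStream() throws Exception {
        pushStream.close();
        recognizer.stopContinuousRecognitionAsync().get();
    }
}

Closing the push stream on the final packet tells the service the audio is complete, so the last recognized events arrive shortly after the client finishes sending, which is the timing the question asks about.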
--Rob Chambers [MSFT]