Google Speech - Streaming Request Returns EOF Error

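Using Go, I'm taking an RTMP stream, transcoding it to FLAC (with ffmpeg), and trying to stream it to Google's Speech API to transcribe the audio. However, I keep getting EOF errors when sending the data. I can't find any information about this error in the docs, so I'm not sure what is causing it.

I'm chunking the received data into 3-second segments (the length isn't important, as long as it is shorter than the maximum length of a streaming recognition request).

Here's the core of my code: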

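package main

import (
    "bufio"
    "bytes"
    "context"
    "log"
    "os"
    "os/exec"
    "os/signal"
    "syscall"
    "time"

    speech "cloud.google.com/go/speech/apiv1"
    // Current module layout; older releases import speechpb from
    // google.golang.org/genproto/googleapis/cloud/speech/v1.
    "cloud.google.com/go/speech/apiv1/speechpb"
)

func main() {
    // signal.Notify never blocks when delivering, so the channel must be buffered.
    done := make(chan os.Signal, 1)
    received := make(chan []byte)

    go receive(received)
    go transcribe(received)

    signal.Notify(done, os.Interrupt, syscall.SIGTERM)

    // Block until an interrupt or termination signal arrives.
    <-done
    os.Exit(0)
}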

func receive(received chan<- []byte) {
    var b bytes.Buffer
    stdout := bufio.NewWriter(&b)

    cmd := exec.Command("ffmpeg", "-i", "rtmp://127.0.0.1:1935/live/key", "-f", "flac", "-ar", "16000", "-")
    cmd.Stdout = stdout

    if err := cmd.Start(); err != nil {
        log.Fatal(err)
    }

    ticker := time.NewTicker(3 * time.Second)

    // Every 3 seconds, hand off whatever ffmpeg has written so far.
    for range ticker.C {
        stdout.Flush()
        log.Printf("Received %d bytes", b.Len())
        received <- b.Bytes()
        b.Reset()
    }
}

func transcribe(received <-chan []byte) {
    ctx := context.TODO()

    client, err := speech.NewClient(ctx)
    if err != nil {
        log.Fatal(err)
    }

    stream, err := client.StreamingRecognize(ctx)
    if err != nil {
        log.Fatal(err)
    }

    // Send the initial configuration message.
    if err = stream.Send(&speechpb.StreamingRecognizeRequest{
        StreamingRequest: &speechpb.StreamingRecognizeRequest_StreamingConfig{
            StreamingConfig: &speechpb.StreamingRecognitionConfig{
                Config: &speechpb.RecognitionConfig{
                    Encoding:        speechpb.RecognitionConfig_FLAC,
                    LanguageCode:    "en-GB",
                    SampleRateHertz: 16000,
                },
            },
        },
    }); err != nil {
        log.Fatal(err)
    }

    // Forward each non-empty chunk to the recognizer as audio content.
    for data := range received {
        if len(data) == 0 {
            continue
        }
        log.Printf("Sending %d bytes", len(data))
        if err := stream.Send(&speechpb.StreamingRecognizeRequest{
            StreamingRequest: &speechpb.StreamingRecognizeRequest_AudioContent{
                AudioContent: data,
            },
        }); err != nil {
            log.Printf("Could not send audio: %v", err)
        }
    }
}

Running this code gives the following output:

2017/10/09 16:05:00 Received 191704 bytes
2017/10/09 16:05:00 Saving 191704 bytes
2017/10/09 16:05:00 Sending 191704 bytes
2017/10/09 16:05:00 Could not send audio: EOF

2017/10/09 16:05:03 Received 193192 bytes
2017/10/09 16:05:03 Saving 193192 bytes
2017/10/09 16:05:03 Sending 193192 bytes
2017/10/09 16:05:03 Could not send audio: EOF

2017/10/09 16:05:06 Received 193188 bytes
2017/10/09 16:05:06 Saving 193188 bytes
2017/10/09 16:05:06 Sending 193188 bytes // Notice that this doesn't error

2017/10/09 16:05:09 Received 191704 bytes
2017/10/09 16:05:09 Saving 191704 bytes
2017/10/09 16:05:09 Sending 191704 bytes
2017/10/09 16:05:09 Could not send audio: EOF

Note that not all of the Sends fail.

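Could anyone point me in the right direction? Is it something to do with the FLAC headers, or something else? I also wonder whether resetting the buffer causes some of the data to be dropped (i.e. it's a non-trivial operation that actually takes some time to complete), and the API doesn't like the missing information.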
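The buffer question can be checked in isolation. The sketch below is separate from the program above and uses a hypothetical helper named aliasAfterReset. It relies on the documented behavior of bytes.Buffer: the slice returned by Bytes is only valid until the next modification of the buffer. Reset itself is cheap, so the concern would not be its duration but the shared backing array.

```go
package main

import (
	"bytes"
	"fmt"
)

// aliasAfterReset (hypothetical helper, for illustration only) returns what a
// slice taken from Bytes() and an independent copy of it contain after the
// buffer has been Reset and written to again.
func aliasAfterReset() (aliased, copied string) {
	var b bytes.Buffer
	b.WriteString("first chunk")

	view := b.Bytes()                         // shares the buffer's backing array
	snapshot := append([]byte(nil), view...) // independent copy of the same bytes

	b.Reset()
	b.WriteString("SECOND") // reuses the storage that view still points at

	return string(view), string(snapshot)
}

func main() {
	aliased, copied := aliasAfterReset()
	fmt.Printf("aliased view: %q\n", aliased)
	fmt.Printf("copied data:  %q\n", copied)
}
```

The copied slice still holds the original bytes, while the aliased view now begins with the newer write. Whether this matters in the program above depends on whether the receiver is finished with the slice before the buffer is written again.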

Any help would be really appreciated.

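So, it turns out there is a way to get more information about the state of the stream, so we don't have to rely only on the error that is returned: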

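if err := stream.Send(&speechpb.StreamingRecognizeRequest{
    StreamingRequest: &speechpb.StreamingRecognizeRequest_AudioContent{
        AudioContent: data,
    },
}); err != nil {
    // Once the stream has been terminated, Send reports only io.EOF; Recv
    // yields the underlying status, either as an error or in the response.
    resp, recvErr := stream.Recv()
    if recvErr != nil {
        log.Printf("Could not send audio: %v (recv: %v)", err, recvErr)
    } else {
        log.Printf("Could not send audio: %v", resp.GetError())
    }
}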

This prints:

2017/10/16 17:14:53 Could not send audio: code:3 message:"Invalid audio content: too long."

That's a much more helpful error message!
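Given that message, one way forward is to send each buffer as several smaller messages rather than one large one. A minimal sketch, assuming a hypothetical helper named splitChunks and an illustrative 8 KiB limit (the service's actual limit is not stated here and should be taken from the API documentation):

```go
package main

import "fmt"

// maxChunkBytes is an illustrative value, not a documented API limit.
const maxChunkBytes = 8 * 1024

// splitChunks (hypothetical helper) cuts data into consecutive slices of at
// most size bytes. The slices share memory with data; nothing is copied.
func splitChunks(data []byte, size int) [][]byte {
	if size <= 0 {
		return nil
	}
	var chunks [][]byte
	for len(data) > 0 {
		n := size
		if len(data) < n {
			n = len(data)
		}
		chunks = append(chunks, data[:n])
		data = data[n:]
	}
	return chunks
}

func main() {
	// 191704 is the size of one 3-second buffer in the log output above.
	audio := make([]byte, 191704)
	chunks := splitChunks(audio, maxChunkBytes)
	fmt.Printf("%d chunks, last one %d bytes\n", len(chunks), len(chunks[len(chunks)-1]))
}
```

Each slice would then go out in its own stream.Send call with AudioContent set, in order. Because the slices alias the input, the input must not be modified until every piece has been sent.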