Speech-to-Text: Cannot transcribe long audio files: "google.api_core.future.polling._OperationNotComplete"

I am using the Google Speech-to-Text API to transcribe a 25-minute audio file. I used the transcribe_async.py code for this task, since it is meant for long audio files.

I am on Ubuntu 16.04 with Python 3.5.2. The code does work for a 1-minute audio file.

The error message is shown below. I cannot work out the root cause.

Waiting for operation to complete...
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/google/api_core/retry.py", line 177, in retry_target
    return target()
  File "/usr/local/lib/python3.5/dist-packages/google/api_core/future/polling.py", line 74, in _done_or_raise
    raise _OperationNotComplete()
google.api_core.future.polling._OperationNotComplete

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/google/api_core/future/polling.py", line 94, in _blocking_poll
    retry_(self._done_or_raise)()
  File "/usr/local/lib/python3.5/dist-packages/google/api_core/retry.py", line 260, in retry_wrapped_func
    on_error=on_error,
  File "/usr/local/lib/python3.5/dist-packages/google/api_core/retry.py", line 195, in retry_target
    last_exc)
  File "<string>", line 3, in raise_from
google.api_core.exceptions.RetryError: Deadline of 90.0s exceeded while calling functools.partial(<bound method PollingFuture._done_or_raise of <google.api_core.operation.Operation object at 0x7f10bdf7bef0>>), last exception:

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "transcribe_async.py", line 110, in <module>
    transcribe_gcs(args.path, args.outpath)
  File "transcribe_async.py", line 85, in transcribe_gcs
    response = operation.result(timeout=90)
  File "/usr/local/lib/python3.5/dist-packages/google/api_core/future/polling.py", line 115, in result
    self._blocking_poll(timeout=timeout)
  File "/usr/local/lib/python3.5/dist-packages/google/api_core/future/polling.py", line 97, in _blocking_poll
    'Operation did not complete within the designated '
concurrent.futures._base.TimeoutError: Operation did not complete within the designated timeout.

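The problem appears to be that the transcription takes longer than 90 seconds to run. The traceback shows this: operation.result(timeout=90) stops polling once the 90-second deadline passes, and the resulting RetryError ("Deadline of 90.0s exceeded") is re-raised as a TimeoutError. I suggest increasing the timeout argument to a larger number, chosen according to the length of the audio file, so that the service has enough time to finish the transcription.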

Code to change (line 81 in transcribe_async.py; line 85 in the copy shown in the traceback above):

response = operation.result(timeout=90)
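
As a rough sketch of how the timeout could be scaled with the audio length rather than hard-coded: the helper names polling_timeout and wait_for_result below are hypothetical, and the rule of thumb that transcription takes no longer than the audio itself (factor=1.0) is an assumption, not a documented guarantee. The only library behavior relied on is what the traceback already shows: operation.result(timeout=...) raises concurrent.futures.TimeoutError when the deadline passes.

```python
import concurrent.futures


def polling_timeout(audio_seconds, factor=1.0, minimum=90.0):
    """Pick a polling deadline in seconds from the audio duration.

    Assumption: the service needs at most `factor` times the audio
    length. Tune `factor` for your own files; it is not an official value.
    """
    return max(minimum, audio_seconds * factor)


def wait_for_result(operation, audio_seconds):
    """Block on a long-running operation with a length-based timeout."""
    timeout = polling_timeout(audio_seconds)
    try:
        return operation.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        # Only the local wait gave up here; report the deadline that was used.
        raise RuntimeError(
            "Transcription not finished after {:.0f}s; "
            "try a larger factor".format(timeout)
        )
```

In transcribe_async.py this would replace the line above with something like response = wait_for_result(operation, audio_seconds=25 * 60), which waits up to 1500 seconds for the 25-minute file. Simply writing operation.result(timeout=1500) has the same effect if a fixed value is enough.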