Unable to improve transcription accuracy with speech adaptation boost
I am using the SpeechRecognition Python library to perform speech-to-text. I call the recognize_google_cloud function to use the Google Cloud Speech-to-Text API.
Here is my code:
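import speech_recognition as sr;
import json;

# Read the service-account key and pass it on as a single-line JSON string.
j = '';
with open('key.json', 'r') as f:
    j = f.read().replace('\n', '');
js = json.loads(j);

r = sr.Recognizer();
mic = sr.Microphone();

# NOTE: `candide` is not defined in the code as posted; it is presumably an
# audio source (for example an sr.AudioFile) created elsewhere in the script.
with candide as source:
    audio = r.record(source);
print(r.recognize_google_cloud(audio, language='fr-FR', preferred_phrases=['pistoles', 'disait'], credentials_json=j));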
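The recognize_google_cloud function sends the captured audio data to the Google API and selects the most likely transcription of the given speech from a set of alternatives.
As described in this page of the documentation, the preferred_phrases parameter is used to select another alternative that contains the listed words.
These results can be improved with speech adaptation boost. Since the released version of the SpeechRecognition library does not let us specify a boost value, I updated the speech_recognition/__init__.py file with a hardcoded boost value: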
if preferred_phrases is not None:
    speech_config["speechContexts"] = {"phrases": preferred_phrases, "boost": 19}
Unfortunately, when I run my code, I get the following error:
Traceback (most recent call last):
  File "/home/pierre/.local/lib/python3.8/site-packages/speech_recognition/__init__.py", line 931, in recognize_google_cloud
    response = request.execute()
  File "/home/pierre/.local/lib/python3.8/site-packages/googleapiclient/_helpers.py", line 134, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/home/pierre/.local/lib/python3.8/site-packages/googleapiclient/http.py", line 915, in execute
    raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 400 when requesting https://speech.googleapis.com/v1/speech:recognize?alt=json returned "Invalid JSON payload received. Unknown name "boost" at 'config.speech_contexts': Cannot find field.". Details: "[{'@type': 'type.googleapis.com/google.rpc.BadRequest', 'fieldViolations': [{'field': 'config.speech_contexts', 'description': 'Invalid JSON payload received. Unknown name "boost" at \'config.speech_contexts\': Cannot find field.'}]}]">
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "spech_reco.py", line 23, in <module>
    print(r.recognize_google_cloud(audio, language='fr-FR', preferred_phrases=['pistoles', 'disait'], credentials_json=j));
  File "/home/pierre/.local/lib/python3.8/site-packages/speech_recognition/__init__.py", line 933, in recognize_google_cloud
    raise RequestError(e)
speech_recognition.RequestError: <HttpError 400 when requesting https://speech.googleapis.com/v1/speech:recognize?alt=json returned "Invalid JSON payload received. Unknown name "boost" at 'config.speech_contexts': Cannot find field.". Details: "[{'@type': 'type.googleapis.com/google.rpc.BadRequest', 'fieldViolations': [{'field': 'config.speech_contexts', 'description': 'Invalid JSON payload received. Unknown name "boost" at \'config.speech_contexts\': Cannot find field.'}]}]">
Is there an error in my request?
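I understand that you are modifying the speech_recognition/__init__.py file of the SpeechRecognition library in order to include the "boost" parameter in your request.
Looking at this file, I noticed that it uses the 'v1' version of the API; however, the "boost" parameter is only supported in the 'v1p1beta1' version.
Therefore, you can make the following modification in that file:
`speech_service = build("speech", "v1p1beta1", credentials=api_credentials, cache_discovery=False)`
With this modification, you should no longer see the BadRequest error.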
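As a rough illustration of the request body involved, here is a minimal sketch that builds it as a plain dictionary. The helper name build_recognize_body is hypothetical and is not part of SpeechRecognition; the field names follow the REST reference for speech:recognize, which documents speechContexts as a list of context objects. A real request would usually also carry audio encoding settings, which are left out here.

```python
import base64
import json


def build_recognize_body(audio_bytes, language, phrases, boost=None):
    """Build a JSON-serializable body for a speech:recognize request.

    Hypothetical helper for illustration only.
    """
    context = {"phrases": list(phrases)}
    if boost is not None:
        # An endpoint whose schema lacks this field answers with
        # "Unknown name "boost"", as in the traceback in the question.
        context["boost"] = boost
    return {
        "config": {
            "languageCode": language,
            "speechContexts": [context],
        },
        # Inline audio is sent as base64-encoded text.
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }


body = build_recognize_body(b"\x00\x01", "fr-FR", ["pistoles", "disait"], boost=19)
print(json.dumps(body["config"], indent=2))
```

Leaving boost as None produces the context without the extra field, which is the shape the unpatched library sends.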
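Also, please note that this is a third-party library that uses the Google Speech-to-Text API internally. If it does not meet all of your current needs, another option is to create your own implementation using the Speech-to-Text API Python Client library directly.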
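A minimal sketch of that second option, assuming the google-cloud-speech package (2.x) is installed and a service-account key file is available. The function names, the file paths in the usage note, and the default boost value are assumptions for illustration, not something taken from the question.

```python
def adaptation_settings(phrases, boost):
    """Keyword arguments for one speech context (hypothetical helper)."""
    return {"phrases": list(phrases), "boost": float(boost)}


def transcribe(wav_path, key_path, language="fr-FR", phrases=(), boost=10.0):
    """Transcribe a short WAV file with phrase hints and a boost value."""
    # Imported lazily so the helper above works without the package installed.
    from google.cloud import speech_v1p1beta1 as speech

    client = speech.SpeechClient.from_service_account_file(key_path)
    with open(wav_path, "rb") as f:
        audio = speech.RecognitionAudio(content=f.read())
    config = speech.RecognitionConfig(
        language_code=language,
        speech_contexts=[
            speech.SpeechContext(**adaptation_settings(phrases, boost))
        ],
    )
    response = client.recognize(config=config, audio=audio)
    # Keep the top alternative of each recognized segment.
    return [result.alternatives[0].transcript for result in response.results]
```

It could then be called as transcribe("audio.wav", "key.json", phrases=["pistoles", "disait"], boost=19), where both file names are placeholders. Calling the client library directly keeps the boost value a normal argument instead of a hardcoded edit inside site-packages.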