Not getting the expected result with IBM Watson Speech To Text
When I try to test an mp3 file on the standard IBM Watson S2T model, I get the following output:
<bound method DetailedResponse.get_result of <ibm_cloud_sdk_core.detailed_response.DetailedResponse object at 0x00000250B1853700>>
This isn't an error, but it isn't the output I want either.
Here is my code:
api = IAMAuthenticator(api_key)
speech_to_text = SpeechToTextV1(authenticator=api)
speech_to_text.set_service_url(url)
with open(mp3_file, "rb") as audio_file:
    result = speech_to_text.recognize(
        model='de-DE_BroadbandModel', audio=audio_file, content_type="audio/mp3"
    ).get_result
print(result)
I'm fairly new to this topic and haven't really figured out what the parameters do. I was expecting output like
{'result': [...]}
I followed this tutorial.
What am I doing wrong?
Here is the code that worked for me with the sample file audio-file2.mp3:
import json
from os.path import join, dirname
from ibm_watson import SpeechToTextV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

# Replace {api_key} and {url} with your service credentials
authenticator = IAMAuthenticator('{api_key}')
speech_to_text = SpeechToTextV1(
    authenticator=authenticator
)
speech_to_text.set_service_url('{url}')

# The audio file is expected next to this script
with open(join(dirname(__file__), './.', 'audio-file2.mp3'),
          'rb') as audio_file:
    speech_recognition_results = speech_to_text.recognize(
        audio=audio_file,
        content_type='audio/mp3',
        word_alternatives_threshold=0.9
    ).get_result()  # call get_result() to obtain the response as a dict
print(json.dumps(speech_recognition_results, indent=2))
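Steps:
- Create a Watson Speech-to-Text service.
- Replace {url} and {api_key} in the Python code with your Speech-to-Text service credentials.
- Save the code in a file named speech-to-text.py.
- From a command prompt or terminal, run pip install ibm-watson and then python speech-to-text.py to see a result similar to the one shown below.
For more options, refer to the speech-to-text api docs.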
{
  "result_index": 0,
  "results": [
    {
      "final": true,
      "alternatives": [
        {
          "transcript": "a line of severe thunderstorms with several possible tornadoes is approaching Colorado on Sunday ",
          "confidence": 1.0
        }
      ],
      "word_alternatives": [
        {
          "start_time": 0.2,
          "end_time": 0.35,
          "alternatives": [
            {
              "word": "a",
              "confidence": 0.94
            }
          ]
        },
        {
          "start_time": 0.35,
          "end_time": 0.69,
          "alternatives": [
            {
              "word": "line",
              "confidence": 0.94
            }
          ]
        },
        {
          "start_time": 0.69,
          "end_time": 0.78,
          "alternatives": [
            {
              "word": "of",
              "confidence": 1.0
            }
          ]
        },
        {
          "start_time": 0.78,
          "end_time": 1.13,
          "alternatives": [
            {
              "word": "severe",
              "confidence": 1.0
            }
          ]
        },
        {
          "start_time": 1.13,
          "end_time": 1.9,
          "alternatives": [
            {
              "word": "thunderstorms",
              "confidence": 1.0
            }
          ]
        },
        {
          "start_time": 4.0,
          "end_time": 4.18,
          "alternatives": [
            {
              "word": "is",
              "confidence": 1.0
            }
          ]
        },
        {
          "start_time": 4.18,
          "end_time": 4.63,
          "alternatives": [
            {
              "word": "approaching",
              "confidence": 1.0
            }
          ]
        },
        {
          "start_time": 4.63,
          "end_time": 5.21,
          "alternatives": [
            {
              "word": "Colorado",
              "confidence": 0.93
            }
          ]
        },
        {
          "start_time": 5.21,
          "end_time": 5.37,
          "alternatives": [
            {
              "word": "on",
              "confidence": 0.93
            }
          ]
        },
        {
          "start_time": 5.37,
          "end_time": 6.09,
          "alternatives": [
            {
              "word": "Sunday",
              "confidence": 0.94
            }
          ]
        }
      ]
    }
  ]
}
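If only the recognized text is needed, the dictionary returned by get_result() can be walked directly. The following is a minimal sketch, assuming the response has the shape shown above; extract_transcripts is a hypothetical helper name (not part of the IBM SDK), and the sample dictionary is a shortened copy of the output above so the snippet runs without calling the service.

```python
# Hypothetical helper, not part of ibm_watson: collects the first
# transcript alternative of each entry in "results".
def extract_transcripts(response):
    transcripts = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:
            # The sample transcript ends with a trailing space, so strip it
            transcripts.append(alternatives[0]["transcript"].strip())
    return transcripts


# Shortened copy of the response shown above (word_alternatives omitted)
sample = {
    "result_index": 0,
    "results": [
        {
            "final": True,
            "alternatives": [
                {
                    "transcript": "a line of severe thunderstorms with several possible tornadoes is approaching Colorado on Sunday ",
                    "confidence": 1.0,
                }
            ],
        }
    ],
}

for text in extract_transcripts(sample):
    print(text)
```

With the real service, the same helper could presumably be applied to speech_recognition_results from the script above.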
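get_result is a method, so you need to call it. You are printing the method itself rather than calling it, which is why your output shows
<bound method DetailedResponse.get_result ...
A pair of parentheses should fix it.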
with open(mp3_file, "rb") as audio_file:
    result = speech_to_text.recognize(
        model='de-DE_BroadbandModel', audio=audio_file, content_type="audio/mp3"
    ).get_result()  # note the parentheses: this calls the method
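The difference between referencing a method and calling it can be reproduced without the Watson service. This is only an illustrative sketch: FakeResponse is a hypothetical stand-in for DetailedResponse, written here so the snippet runs on its own, and it is not the SDK's implementation.

```python
class FakeResponse:
    """Hypothetical stand-in for DetailedResponse (not the IBM SDK class)."""

    def __init__(self, result):
        self._result = result

    def get_result(self):
        # Return the stored payload, as a response object would
        return self._result


response = FakeResponse({"results": []})

not_called = response.get_result   # no parentheses: the bound method object
called = response.get_result()     # parentheses: the method runs and returns the dict

print(type(not_called).__name__)   # the method object, not the payload
print(called)                      # the payload itself
```

Printing not_called gives a <bound method ...> representation like the one in the question, while called holds the actual dictionary.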