我们如何使用 python 将存储容器中的输入文件提供给 azure speech api

How can we give the input file from storage container to azure speech api using python

下面是代码,

call_name1="test.wav"
blob_client1=blob_service_client.get_blob_client("bucket/audio",call_name1)
print(blob_client1)

streamdownloader=blob_client1.download_blob()
stream = BytesIO()
streamfinal=streamdownloader.download_to_stream(stream)
print(streamfinal)

speech_key, service_region = "12345", "eastus"
speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)

audio_input = speechsdk.audio.AudioConfig(filename=streamfinal)

错误,

TypeError                                 Traceback (most recent call last)
<ipython-input-6-a402ae91606a> in <module>
     44 speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
     45
---> 46 audio_input = speechsdk.audio.AudioConfig(filename=streamfinal)

C:\ProgramData\Anaconda3\lib\site-packages\azure\cognitiveservices\speech\audio.py in __init__(self, use_default_microphone, filename, stream, device_name)
    213
    214         if filename is not None:
--> 215             self._impl = impl.AudioConfig._from_wav_file_input(filename)
    216             return
    217         if stream is not None:

TypeError: in method 'AudioConfig__from_wav_file_input', argument 1 of type 'std::string const &'

请帮助我们从存储容器中读取音频文件作为 Azure 语音的输入 api。谢谢!!

正如 ewong 在评论中所说,您需要获取 stream 而不是字符串。

download_to_stream 用于将此 blob 的内容下载到流中。但不是 azure.cognitiveservices.speech.audio.AudioInputStream AudioConfig 需要的。

我找不到将流转换为 AudioInputStream 的解决方法。所以,好像只有从Storage Blob下载音频文件到本地,再通过AudioConfig上传的方式。

from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient
import azure.cognitiveservices.speech as speechsdk

filename = "test.txt"
container_name="test-container"

blob_service_client = BlobServiceClient.from_connection_string("DefaultEndpointsProtocol=https;AccountName=pamelastorage;AccountKey=UOyhItMnWJmB54Jmj8U0YtStNFk0vZyN1+nRem9+JwqNVJEMh5deerdfLbhVQl0ztmg96UZEUtRh2HVp8+ZJWA==;EndpointSuffix=core.windows.net")
container_client=blob_service_client.get_container_client(container_name)
blob_client = container_client.get_blob_client(filename)

with open(filename, "wb") as f:
    data = blob_client.download_blob()
    data.readinto(f)

audio_input = speechsdk.audio.AudioConfig(filename=filename)
print(audio_input)