Get subtitles from an AWS Transcribe job
I am writing a function that fetches the transcript output of an AWS Transcribe job.
    import json
    import time
    import urllib.request

    import boto3

    def get_text(job_name, file_uri):
        transcribe_client = boto3.client('transcribe')
        text = None  # stays None if the job fails or never completes
        max_tries = 60
        while max_tries > 0:
            max_tries -= 1
            job = transcribe_client.get_transcription_job(TranscriptionJobName=job_name)
            job_status = job['TranscriptionJob']['TranscriptionJobStatus']
            if job_status in ['COMPLETED', 'FAILED']:
                print(f"Job {job_name} is {job_status}.")
                if job_status == 'COMPLETED':
                    # The transcript file is a JSON document.
                    response = urllib.request.urlopen(job['TranscriptionJob']['Transcript']['TranscriptFileUri'])
                    data = json.loads(response.read())
                    print(data)
                    text = data['results']['transcripts'][0]['transcript']
                break
            else:
                print(f"Waiting for {job_name}. Current status is {job_status}.")
            time.sleep(10)
        return text
This gives me exactly the output I expect. But when I change job['TranscriptionJob']['Transcript']['TranscriptFileUri'] to job['TranscriptionJob']['Subtitles']['SubtitleFileUris'], I get an error.
How should I handle this?
job['TranscriptionJob']['Subtitles']['SubtitleFileUris'] is a list of URIs, not a single URI. You need to change your code to something like this:
    if job_status == 'COMPLETED':
        subtitles = []
        for uri in job['TranscriptionJob']['Subtitles']['SubtitleFileUris']:
            response = urllib.request.urlopen(uri)
            # Subtitle files are plain-text SRT or WebVTT, not JSON.
            subtitles.append(response.read().decode('utf-8'))
        text = '\n'.join(subtitles)
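If you want something a little more defensive, here is a minimal sketch. It assumes the job was started with the Subtitles parameter (for example Subtitles={'Formats': ['srt']}), since otherwise the response has no 'Subtitles' key at all. The names subtitle_uris and fetch_subtitles and the fetch argument are my own, made up for illustration; they are not part of boto3.

```python
import urllib.request


def subtitle_uris(job):
    """Return the subtitle URIs from a get_transcription_job response.

    Returns an empty list when the job was started without subtitles.
    """
    subtitles = job.get('TranscriptionJob', {}).get('Subtitles', {})
    return list(subtitles.get('SubtitleFileUris', []))


def fetch_subtitles(job, fetch=None):
    """Download every subtitle file and return a {uri: text} dict.

    `fetch` is a hypothetical hook (uri -> bytes) so the download step
    can be swapped out, e.g. in tests; by default it uses urllib.
    """
    if fetch is None:
        def fetch(uri):
            with urllib.request.urlopen(uri) as response:
                return response.read()
    # SRT and WebVTT files are plain text, so decode instead of json.loads.
    return {uri: fetch(uri).decode('utf-8') for uri in subtitle_uris(job)}
```

In your loop you would then call fetch_subtitles(job) once the status is 'COMPLETED' and pick the entry for the format you need.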