Re-encoding audio file to LINEAR16 for Google Cloud Speech API fails with '[Errno 30] Read-only file system'

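I am trying to convert an audio file to LINEAR16 format using the ffmpeg module. The audio file is stored in one Cloud Storage bucket, and I want to move the converted file to a different bucket. The code works perfectly in VS Code and deploys successfully to Cloud Functions, but when it runs in the cloud it fails with [Errno 30] Read-only file system.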


from google.cloud import speech
from google.cloud import storage
import ffmpeg
import sys

out_bucket = 'encoded_audio_landing'
input_bucket_name = 'audio_landing'

def process_audio(input_bucket_name, in_filename, out_bucket):
    """
    Converts audio encoding for GSK call center call recordings to linear16 encoding and a
    16,000 hertz sample rate.

    Params:
        in_filename: a gsk call audio file

    Produces an audio file encoded so that the Google Speech-to-Text API can transcribe it.
    """
    storage_client = storage.Client()
    bucket = storage_client.bucket(input_bucket_name)

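    blob = bucket.blob(in_filename)
    # Downloads to a path relative to the working directory; this is the call
    # that fails with [Errno 30] on Cloud Functions.
    blob.download_to_filename(blob.name)
    print('type contents: ', type('processedfile'))
    #print('blob name / len / type', blob.name, len(blob.name), type(blob.name))

    try:
        out, err = (
            ffmpeg.input(blob.name)
            # 'pipe:' sends the raw PCM output to stdout, which run() captures
            .output('pipe:', format="s16le", acodec="pcm_s16le", ac=1, ar="16k")
            .run(capture_stdout=True, capture_stderr=True)
        )
    except ffmpeg.Error as e:
        print(e.stderr, file=sys.stderr)
        # Re-raise: without converted audio there is nothing to upload
        raise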

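    up_bucket = storage_client.bucket(out_bucket)
    up_blob = up_bucket.blob(blob.name)
    #print('type / len out', type(out), len(out))
    # Upload the converted bytes captured from ffmpeg's stdout
    up_blob.upload_from_string(out)

    #delete source file
    blob.delete()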

def hello_gcs(event, context):
    """Background Cloud Function to be triggered by Cloud Storage.
       This generic function logs relevant data when a file is changed,
       and works for all Cloud Storage CRUD operations.
    Args:
        event (dict):  The dictionary with data specific to this type of event.
                       The `data` field contains a description of the event in
                       the Cloud Storage `object` format.
        context (google.cloud.functions.Context): Metadata of triggering event.
    Returns:
        None; the output is written to Cloud Logging
    """

    #print('Event ID: {}'.format(context.event_id))
    #print('Event type: {}'.format(context.event_type))
    print('Bucket: {}'.format(event['bucket']))
    print('File: {}'.format(event['name']))
    print('Metageneration: {}'.format(event['metageneration']))
    #print('Created: {}'.format(event['timeCreated']))
    #print('Updated: {}'.format(event['updated']))

    #convert audio encoding
    print('begin process_audio')
    process_audio(input_bucket_name, event['name'], out_bucket)

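The problem is that I was downloading the file to my local directory, which obviously cannot work in the cloud. I read another post where someone added a function that builds a file path and passed its result to blob.download_to_filename(). I'm not sure why that works.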
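
A minimal sketch of why the temp-path approach likely helps. As far as I know, on Cloud Functions the deployed source directory is read-only and only the temp directory (/tmp) is writable; tempfile.gettempdir() returns that writable directory, so a path built from it is a valid download target. The helper name temp_path_for is hypothetical, and os.path.basename stands in for werkzeug's secure_filename only to keep the sketch standard-library only.

```python
import os
import tempfile


def temp_path_for(object_name):
    """Return a path for object_name inside the system temp directory.

    Hypothetical helper: os.path.basename drops any "folder/" prefix in the
    object name, standing in for werkzeug's secure_filename.
    """
    return os.path.join(tempfile.gettempdir(), os.path.basename(object_name))


if __name__ == "__main__":
    path = temp_path_for("calls/recording.wav")
    # Writing here succeeds because the temp directory is writable.
    with open(path, "wb") as f:
        f.write(b"fake audio bytes")
    print(path, os.path.getsize(path))
    # Clean up so the temporary file does not linger.
    os.remove(path)
```

With the real client, the resulting path would be passed along the lines of blob.download_to_filename(temp_path_for(blob.name)). Removing the file afterwards is worthwhile, since the temp directory on Cloud Functions is reportedly backed by the function's memory.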

I did try removing the download_to_filename call entirely, but the code does not work without it.
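
One possible way to drop the download-to-disk step, sketched under assumptions: read the object into memory (for example with blob.download_as_bytes()) and feed the bytes to ffmpeg on stdin, reading the converted audio from stdout. ffmpeg still needs its input from somewhere, which is probably why simply deleting the download call breaks the code. The helper names build_ffmpeg_cmd and pipe_through are hypothetical; the sketch uses subprocess rather than the ffmpeg-python wrapper, and it assumes an ffmpeg binary on the PATH and an input format that ffmpeg can read from a non-seekable pipe (some container formats cannot be).

```python
import subprocess


def build_ffmpeg_cmd():
    """Build an ffmpeg command that reads stdin and writes raw LINEAR16 to stdout."""
    return [
        "ffmpeg",
        "-i", "pipe:0",          # read the source audio from stdin
        "-f", "s16le",           # raw signed 16-bit little-endian samples
        "-acodec", "pcm_s16le",
        "-ac", "1",              # mono
        "-ar", "16000",          # 16 kHz sample rate
        "pipe:1",                # write the result to stdout
    ]


def pipe_through(cmd, data):
    """Run cmd with data on stdin and return what it wrote to stdout."""
    result = subprocess.run(cmd, input=data, capture_output=True, check=True)
    return result.stdout


# Usage (not run here; needs ffmpeg and real blobs):
#   raw = blob.download_as_bytes()
#   out = pipe_through(build_ffmpeg_cmd(), raw)
#   up_blob.upload_from_string(out)
```

This keeps everything in memory, so it only suits files that fit comfortably within the function's memory limit; for larger recordings the temp-directory download is the safer choice.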


import os
import tempfile

from werkzeug.utils import secure_filename

# This gets around downloading the file to a local folder: it builds a path
# inside the system temp directory instead.
def get_file_path(filename):
    file_name = secure_filename(filename)
    return os.path.join(tempfile.gettempdir(), file_name)

def process_audio(input_bucket_name, in_filename, out_bucket):
    """
    Converts audio encoding for GSK call center call recordings to linear16 encoding and a
    16,000 hertz sample rate.

    Params:
        in_filename: a gsk call audio file
        input_bucket_name: location of the source file that needs to be re-encoded
        out_bucket: where to put the newly encoded file

    Produces an audio file encoded so that the Google Speech-to-Text API can transcribe it.
    """
    storage_client = storage.Client()
    bucket = storage_client.bucket(input_bucket_name)

    blob = bucket.blob(in_filename)


    # Builds a temp location for the file and downloads the blob there
    file_path = get_file_path(blob.name)
    blob.download_to_filename(file_path)
    print('type contents: ', type('processedfile'))
    #print('blob name / len / type', blob.name, len(blob.name), type(blob.name))

    # Invokes the ffmpeg library to re-encode the audio file. It is actually a wrapper around a
    #   command-line application that is available in Python and Google Cloud. The arguments in the
    #   .output() call are ffmpeg options; you pass them to ffmpeg there.
    try:
        out, err = (
            ffmpeg.input(file_path)
            # 'pipe:' sends the raw PCM output to stdout, which run() captures
            .output('pipe:', format="s16le", acodec="pcm_s16le", ac=1, ar="16k")
            .run(capture_stdout=True, capture_stderr=True)
        )
    except ffmpeg.Error as e:
        print(e.stderr, file=sys.stderr)
        # Re-raise so the failure is visible instead of continuing without output
        raise