Loading a File from Google Bucket into PyDub AudioSegment

I have been trying to load an .mp3 file stored in a Google Cloud Storage bucket into pydub. Here is my code:

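import io

from google.cloud import storage
from pydub import AudioSegment

# file_path holds the object name of the .mp3 inside the bucket
f = io.BytesIO()
storage_client = storage.Client()
bucket_name = "my-bucket"
bucket = storage_client.get_bucket(bucket_name)
blob = bucket.blob(file_path)
blob.download_to_file(f)  # writes the object's bytes into the in-memory buffer

currentAudio = AudioSegment.from_mp3(f)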

This is the path of the file: https://storage.cloud.google.com/written-audio-files/10530c70-52af-4ed7-a2ad-146738141b41.mp3

So the file exists and is downloaded correctly. When the last line

currentAudio = AudioSegment.from_mp3(f)

executes, an FFMPEG error is raised:

pydub.exceptions.CouldntDecodeError: Decoding failed. ffmpeg returned error code: 1

Output from ffmpeg/avlib:

b'ffmpeg version 3.4.6-0ubuntu0.18.04.1 Copyright (c) 2000-2019 the FFmpeg developers\n  built with gcc 7 (Ubuntu 7.3.0-16ubuntu3)\n  configuration: --prefix=/usr --extra-version=0ubuntu0.18.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librubberband --enable-librsvg --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264 --enable-shared\n  libavutil      55. 78.100 / 55. 78.100\n  libavcodec     57.107.100 / 57.107.100\n  libavformat    57. 83.100 / 57. 83.100\n  libavdevice    57. 10.100 / 57. 10.100\n  libavfilter     6.107.100 /  6.107.100\n  libavresample   3.  7.  0 /  3.  7.  0\n  libswscale      4.  8.100 /  4.  8.100\n  libswresample   2.  9.100 /  2.  9.100\n  libpostproc    54.  7.100 / 54.  7.100\n[mp3 @ 0x556a1d4c7820] Failed to read frame size: Could not seek to 1026.\npipe:: Invalid argument\n'

It looks to me as though from_mp3 expects a physical file and cannot work with an in-memory BytesIO object.

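I am running this in a Google Cloud Function, where I may not have any local storage. Saving the file and then reloading it from /tmp would also be time-consuming.

Saving to the local file system and reopening the file works as expected on my local machine, where I am currently testing the program. How can we work around this problem?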
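
One thing that may be worth ruling out first, as an assumption rather than something the error output confirms: writing into a BytesIO leaves its position at the end of the data, so a later read() starts there unless the buffer is rewound. The sketch below uses a hypothetical helper, download_to_buffer, that downloads and rewinds in one step. It assumes `blob` is a google.cloud.storage Blob, or any object with a compatible download_to_file(file_obj) method.

```python
import io


def download_to_buffer(blob):
    """Download a blob into memory and rewind the buffer for reading.

    `blob` is assumed to be a google.cloud.storage Blob, or any object
    with a compatible download_to_file(file_obj) method.
    """
    buffer = io.BytesIO()
    blob.download_to_file(buffer)  # writing leaves the position at the end
    buffer.seek(0)                 # rewind so the next read() starts at byte 0
    return buffer
```

With the `blob` from the code above, this could be tried as AudioSegment.from_file(download_to_buffer(blob), format="mp3"). Whether ffmpeg can then decode the stream still depends on the file itself being a valid MP3.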

**Update** Passing the downloaded file to ffmpeg produces the following output:

ffmpeg -i 10530c70-52af-4ed7-a2ad-146738141b41.mp3 -hide_banner
[mp3 @ 0x561b77222760] Estimating duration from bitrate, this may be inaccurate
Input #0, mp3, from '10530c70-52af-4ed7-a2ad-146738141b41.mp3':
  Duration: 00:00:02.09, start: 0.000000, bitrate: 32 kb/s
    Stream #0:0: Audio: mp3, 24000 Hz, mono, s16p, 32 kb/s
At least one output file must be specified

I think there is something wrong with your MP3 file. I cannot play it, and FFMPEG is not sure what type of file it is:

$ ffmpeg -i 10530c70-52af-4ed7-a2ad-146738141b41.mp3 -hide_banner
[mp3 @ 0x7fa661000000] Format mp3 detected only with low score of 1, misdetection possible!
[mp3 @ 0x7fa661000000] Failed to read frame size: Could not seek to 58401.
10530c70-52af-4ed7-a2ad-146738141b41.mp3: Invalid argument
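
The "low score" message suggests the bytes do not look much like MP3 data. A rough way to check what was actually downloaded is to inspect the first few bytes. The function below, looks_like_mp3, is a hypothetical helper and only a heuristic: it accepts data that starts with an ID3v2 tag or with an MPEG audio frame sync (11 set bits), and it does not validate the rest of the file.

```python
def looks_like_mp3(data: bytes) -> bool:
    """Heuristic check on the first bytes of a file; not a full validation."""
    if len(data) < 3:
        return False
    # ID3v2 metadata tags start with the ASCII bytes "ID3".
    if data[:3] == b"ID3":
        return True
    # An MPEG audio frame header starts with 11 set bits (the frame sync).
    return data[0] == 0xFF and (data[1] & 0xE0) == 0xE0
```

If this returns False for the downloaded bytes, the object is probably not audio at all (for example, an error page saved under an .mp3 name), and no way of handing it to pydub will decode it.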

Save your file in the tmp directory and then pass that path to AudioSegment.from_file().

import os
import tempfile

from google.cloud import storage
from pydub import AudioSegment

tmpdir = tempfile.gettempdir()  # returns the path of the temporary directory
tempFilePath = os.path.join(tmpdir, file_path)
storage_client = storage.Client()
bucket_name = "my-bucket"
bucket = storage_client.get_bucket(bucket_name)
blob = bucket.blob(file_path)
blob.download_to_filename(tempFilePath)
currentAudio = AudioSegment.from_file(tempFilePath, format="mp3")
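
One caveat with building the temporary path from file_path: object names in a bucket may contain "/", in which case the joined path points into a subdirectory of the temp directory that does not exist yet. A minimal sketch of a safer path builder follows. The name temp_path_for is hypothetical, and the sketch assumes that keeping only the last path component is acceptable, meaning no two objects processed at the same time share a base name.

```python
import os
import tempfile


def temp_path_for(object_name: str) -> str:
    """Return a path directly inside the temp directory for a bucket object."""
    # Object names may contain "/" (for example "audio/clip.mp3"); keeping only
    # the last component avoids pointing into subdirectories that do not exist.
    return os.path.join(tempfile.gettempdir(), os.path.basename(object_name))
```

This would replace the tempFilePath line, as in tempFilePath = temp_path_for(file_path). As far as I know, /tmp in Cloud Functions is backed by memory, so it is worth calling os.remove(tempFilePath) once the AudioSegment has been loaded.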