How to process files in FastAPI from multiple clients without saving the files to disk

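I am using FastAPI to create an API that receives small audio files from a mobile application. In this API, I process the signal and, once the sound has been classified, return a response. The end goal is to return the classification to the user.

Here is what I have done so far: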

    @app.post("/predict")
    def predict(file: UploadFile = File(...)):  # receive the WAV audio sent by the mobile app user
        with open(name_file, "wb") as buffer:
            shutil.copyfileobj(file.file, buffer)  # create a file with the received audio data
        ...
        prev = test.my_classification_module(name_file)  # some processing; the target response ends up in prev

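In my_classification_module(), I have this:

    X, sr = librosa.load(sound_file)

I would like to avoid creating a file on disk just so that librosa can load it for classification. I would like to do this with a temporary file, without using memory unnecessarily, and without files overlapping when multiple users use the app at the same time.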

If your function supports a file-like object directly, you could use the .file attribute of UploadFile, e.g., file.file (which is a SpooledTemporaryFile instance), or, if your function requires the file in bytes format, use the .read() async method (see the documentation). If you wish to keep your route defined with def instead of async def (FastAPI handles the two differently), you could use the .read() method of the file-like object, e.g., file.file.read().
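As a minimal sketch of the two options above, the following stdlib-only example uses a SpooledTemporaryFile as a stand-in for file.file and the built-in wave module as a stand-in for the audio loader. Both substitutions are assumptions made for illustration, and the helper names make_wav_bytes, duration_from_filelike and duration_from_bytes are hypothetical. It reads the same upload once as a file-like object and once as bytes, without creating a named file on disk.

```python
import io
import wave
from tempfile import SpooledTemporaryFile


def make_wav_bytes(n_frames: int = 8000, sample_rate: int = 8000) -> bytes:
    """Build a short silent mono 16-bit WAV in memory (sample data only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate)
        w.writeframes(b"\x00\x00" * n_frames)
    return buf.getvalue()


def duration_from_filelike(fileobj) -> float:
    """Option 1: pass the file-like object straight to the reader."""
    fileobj.seek(0)  # make sure reading starts at the beginning
    with wave.open(fileobj, "rb") as w:
        return w.getnframes() / w.getframerate()


def duration_from_bytes(contents: bytes) -> float:
    """Option 2: read the bytes first, then wrap them in a BytesIO stream."""
    with wave.open(io.BytesIO(contents), "rb") as w:
        return w.getnframes() / w.getframerate()


# Hypothetical stand-in for UploadFile.file
upload = SpooledTemporaryFile(max_size=1024 * 1024)
upload.write(make_wav_bytes())

print(duration_from_filelike(upload))      # 1.0
upload.seek(0)
print(duration_from_bytes(upload.read()))  # 1.0
upload.close()
```

In a real route, file.file would take the place of upload, and librosa.load() or sf.read() the place of wave.open().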

Update - How to resolve the "File contains data in an unknown format" error

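  1. Make sure the audio file is not corrupted. For example, if you save it and open it with a media player, does the sound file play?

  2. Make sure you have the latest version of the librosa module installed.

  3. Try installing ffmpeg and adding it to the system path.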

  4. As described in the documentation, librosa.load() can take a file-like object as an alternative to a file path. Thus, using file.file or file.file._file should normally be fine (as per the documentation, the _file attribute is either an io.BytesIO or io.TextIOWrapper object...).

    However, as described in the documentation, as well as in a GitHub discussion, you could also use the soundfile module to read audio from a file-like object. Example:

    import soundfile as sf 
    data, samplerate = sf.read(file.file)
    
  5. You could also write the contents of the uploaded file to a BytesIO stream and then pass it to either sf.read() or librosa.load():

    from io import BytesIO
    contents = file.file.read()
    buffer = BytesIO(contents)
    data, samplerate = librosa.load(buffer)  # using librosa module
    #data, samplerate = sf.read(buffer)      # using soundfile module
    buffer.close()
    
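  6. Another option would be to save the file contents to a NamedTemporaryFile, which "has a visible name in the file system" that "can be used to open the file". Once you are done with it, you can delete it manually using os.remove() or os.unlink().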

    from tempfile import NamedTemporaryFile
    import os
    contents = file.file.read()
    temp = NamedTemporaryFile(delete=False)
    try:
        with temp as f:
            f.write(contents)
        data, samplerate = librosa.load(temp.name)   # using librosa module
        #data, samplerate = sf.read(temp.name)       # using soundfile module
    finally:
        temp.close()
        os.unlink(temp.name)
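On the concern about files overlapping when several users upload at once: NamedTemporaryFile generates a unique name on each call, so concurrent requests do not share a path. The sketch below wraps option 6 in a hypothetical helper, process_via_tempfile, which accepts any path-based loader (a stand-in for librosa.load or sf.read) and always removes the file afterwards. The helper name, its parameters and the ".wav" suffix are assumptions for illustration only.

```python
import os
from tempfile import NamedTemporaryFile
from typing import Callable, TypeVar

T = TypeVar("T")


def process_via_tempfile(
    contents: bytes,
    loader: Callable[[str], T],
    suffix: str = ".wav",  # assumed extension; adjust to the uploaded format
) -> T:
    """Write contents to a uniquely named temp file, run loader(path), then delete it."""
    temp = NamedTemporaryFile(delete=False, suffix=suffix)
    try:
        with temp as f:
            f.write(contents)
        # The file is closed at this point, so the loader can reopen it by name.
        return loader(temp.name)
    finally:
        # Runs whether the loader succeeded or raised.
        os.unlink(temp.name)


def read_size(path: str) -> int:
    """Toy loader used only for this demonstration."""
    with open(path, "rb") as fh:
        return len(fh.read())


print(process_via_tempfile(b"abc", read_size))  # 3
```

In a route, this might be called as process_via_tempfile(file.file.read(), librosa.load), assuming librosa is installed; the cleanup in finally keeps temporary files from accumulating if loading fails.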