Python3.8 - FastAPI and Serverless (AWS Lambda) - Unable to process files sent to api endpoint

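I have been using FastAPI with Serverless on AWS Lambda functions for a couple of months now, and it has been working well.

I am creating a new API endpoint that requires a file to be sent.

It works perfectly on my local machine, but once deployed to AWS Lambda I get the error below when I call the endpoint with the exact same file that works locally. For now I am testing through the Swagger UI, and nothing differs between the serverless deployment and my local setup apart from the "machine" the code runs on.

Do you have any idea what is happening?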

Python 3.8 FastAPI 0.54.1

My code:

from fastapi import FastAPI, File, UploadFile
import pandas as pd

app = FastAPI()

@app.post('/process_data_import_quote_file')
def process_data_import_quote_file(file: UploadFile = File(...)): # same error if I put bytes instead of UploadFile
    file = file.file.read()
    print(f"file {file}")
    quote_number = pd.read_excel(file, sheet_name='Data').iloc[:, 0].dropna()

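The last line fails.

I tried printing the file, and when I compare the printed data with what I read locally, it is different. I swear it is the same file in both places, so I don't know what could explain this. It is a very basic Excel file with nothing special about it.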

[ERROR] 2020-05-07T14:25:17.878Z    25ff37a5-e313-4db5-8763-1227e8244457    Exception in ASGI application

Traceback (most recent call last):
  File "/var/task/mangum/protocols/http.py", line 39, in run
    await app(self.scope, self.receive, self.send)
  File "/var/task/fastapi/applications.py", line 149, in __call__
    await super().__call__(scope, receive, send)
  File "/var/task/starlette/applications.py", line 102, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/var/task/starlette/middleware/errors.py", line 181, in __call__
    raise exc from None
  File "/var/task/starlette/middleware/errors.py", line 159, in __call__
    await self.app(scope, receive, _send)
  File "/var/task/starlette/exceptions.py", line 82, in __call__
    raise exc from None
  File "/var/task/starlette/exceptions.py", line 71, in __call__
    await self.app(scope, receive, sender)
  File "/var/task/starlette/routing.py", line 550, in __call__
    await route.handle(scope, receive, send)
  File "/var/task/starlette/routing.py", line 227, in handle
    await self.app(scope, receive, send)
  File "/var/task/starlette/routing.py", line 41, in app
    response = await func(request)
  File "/var/task/fastapi/routing.py", line 196, in app
    raw_response = await run_endpoint_function(
  File "/var/task/fastapi/routing.py", line 150, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
  File "/var/task/starlette/concurrency.py", line 34, in run_in_threadpool
    return await loop.run_in_executor(None, func, *args)
  File "/var/lang/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/var/task/app/quote/processing.py", line 100, in process_data_import_quote_file
    quote_number = pd.read_excel(file, sheet_name='Data').iloc[:, 0].dropna()
  File "/var/task/pandas/io/excel/_base.py", line 304, in read_excel
    io = ExcelFile(io, engine=engine)
  File "/var/task/pandas/io/excel/_base.py", line 821, in __init__
    self._reader = self._engines[engine](self._io)
  File "/var/task/pandas/io/excel/_xlrd.py", line 21, in __init__
    super().__init__(filepath_or_buffer)
  File "/var/task/pandas/io/excel/_base.py", line 355, in __init__
    self.book = self.load_workbook(BytesIO(filepath_or_buffer))
  File "/var/task/pandas/io/excel/_xlrd.py", line 34, in load_workbook
    return open_workbook(file_contents=data)
  File "/var/task/xlrd/__init__.py", line 115, in open_workbook
    zf = zipfile.ZipFile(timemachine.BYTES_IO(file_contents))
  File "/var/lang/lib/python3.8/zipfile.py", line 1269, in __init__
    self._RealGetContents()
  File "/var/lang/lib/python3.8/zipfile.py", line 1354, in _RealGetContents
    fp.seek(self.start_dir, 0)
ValueError: negative seek value -62703616

This is due to AWS API Gateway.

I had to allow multipart/form-data in API Gateway and change the read to file = BytesIO(file).read() in order to use the file stream properly.
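To confirm whether the bytes arriving in Lambda are still intact before handing them to pandas, a quick check along these lines may help. This is only a minimal sketch using the standard library: `looks_like_xlsx` is a hypothetical helper name, and it assumes the upload is an .xlsx file (a zip archive that contains an `xl/workbook.xml` entry).

```python
import zipfile
from io import BytesIO


def looks_like_xlsx(data: bytes) -> bool:
    """Return True if `data` opens as a zip archive containing a workbook part.

    Hypothetical diagnostic helper, not part of FastAPI or pandas.
    """
    buffer = BytesIO(data)
    if not zipfile.is_zipfile(buffer):
        return False
    try:
        with zipfile.ZipFile(buffer) as archive:
            return "xl/workbook.xml" in archive.namelist()
    except (zipfile.BadZipFile, ValueError):
        # Older Python versions can raise ValueError ("negative seek value")
        # on damaged archives, as seen in the traceback above.
        return False
```

Logging the result of this check, together with `len(data)`, both locally and in Lambda should show whether the payload is being altered before it reaches the endpoint.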

This is usually caused by binary data being converted to text in API Gateway. To fix it, add the following under the provider section of your serverless.yml file:

apiGateway:
    binaryMediaTypes:
      - 'multipart/form-data'
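To see why a binary-to-text conversion breaks an Excel upload, the effect can be simulated locally. The sketch below is an illustration under assumptions, not a reproduction of what API Gateway does internally: it builds a small zip archive in memory as a stand-in for an .xlsx file and pushes it through a lossy UTF-8 decode and re-encode.

```python
import zipfile
from io import BytesIO

# Build a small zip archive in memory to stand in for the uploaded .xlsx file.
original = BytesIO()
with zipfile.ZipFile(original, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("xl/workbook.xml", "<workbook/>" * 100)
payload = original.getvalue()

# Simulate a lossy binary-to-text round trip (an assumption for illustration):
# bytes that are not valid UTF-8 are replaced, so the data comes back changed.
mangled = payload.decode("utf-8", errors="replace").encode("utf-8")

print("original size:", len(payload))
print("size after round trip:", len(mangled))
print("bytes unchanged:", payload == mangled)
```

Once the bytes change, the offsets stored inside the archive no longer point at the right places, which is consistent with the seek error raised by zipfile in the traceback.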