Python 3.8 - FastAPI and Serverless (AWS Lambda) - Unable to process files sent to API endpoint
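For a couple of months now I have been using FastAPI with Serverless through AWS Lambda functions, and it has been working perfectly.
I'm creating a new API endpoint that requires a file to be sent.
It works perfectly on my local machine, but when I deploy to AWS Lambda I get the following error when calling my endpoint with the exact same file that works locally. For now I'm testing through the Swagger UI, and nothing differs between the Serverless deployment and my local setup except the machine the code runs on.
Do you have any idea what is going on?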
Python 3.8
FastAPI 0.54.1
My code:
from fastapi import FastAPI, File, UploadFile
import pandas as pd

app = FastAPI()


@app.post('/process_data_import_quote_file')
def process_data_import_quote_file(file: UploadFile = File(...)):  # same error if I put bytes instead of UploadFile
    file = file.file.read()
    print(f"file {file}")
    quote_number = pd.read_excel(file, sheet_name='Data').iloc[:, 0].dropna()
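The last line is the one that fails.
I tried printing the file, and the printed data differs from what I read locally. I swear it is the same file in both environments, so I don't know what could explain this.
It is a very basic Excel file with nothing special in it.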
[ERROR] 2020-05-07T14:25:17.878Z 25ff37a5-e313-4db5-8763-1227e8244457 Exception in ASGI application
Traceback (most recent call last):
File "/var/task/mangum/protocols/http.py", line 39, in run
await app(self.scope, self.receive, self.send)
File "/var/task/fastapi/applications.py", line 149, in __call__
await super().__call__(scope, receive, send)
File "/var/task/starlette/applications.py", line 102, in __call__
await self.middleware_stack(scope, receive, send)
File "/var/task/starlette/middleware/errors.py", line 181, in __call__
raise exc from None
File "/var/task/starlette/middleware/errors.py", line 159, in __call__
await self.app(scope, receive, _send)
File "/var/task/starlette/exceptions.py", line 82, in __call__
raise exc from None
File "/var/task/starlette/exceptions.py", line 71, in __call__
await self.app(scope, receive, sender)
File "/var/task/starlette/routing.py", line 550, in __call__
await route.handle(scope, receive, send)
File "/var/task/starlette/routing.py", line 227, in handle
await self.app(scope, receive, send)
File "/var/task/starlette/routing.py", line 41, in app
response = await func(request)
File "/var/task/fastapi/routing.py", line 196, in app
raw_response = await run_endpoint_function(
File "/var/task/fastapi/routing.py", line 150, in run_endpoint_function
return await run_in_threadpool(dependant.call, **values)
File "/var/task/starlette/concurrency.py", line 34, in run_in_threadpool
return await loop.run_in_executor(None, func, *args)
File "/var/lang/lib/python3.8/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/var/task/app/quote/processing.py", line 100, in process_data_import_quote_file
quote_number = pd.read_excel(file, sheet_name='Data').iloc[:, 0].dropna()
File "/var/task/pandas/io/excel/_base.py", line 304, in read_excel
io = ExcelFile(io, engine=engine)
File "/var/task/pandas/io/excel/_base.py", line 821, in __init__
self._reader = self._engines[engine](self._io)
File "/var/task/pandas/io/excel/_xlrd.py", line 21, in __init__
super().__init__(filepath_or_buffer)
File "/var/task/pandas/io/excel/_base.py", line 355, in __init__
self.book = self.load_workbook(BytesIO(filepath_or_buffer))
File "/var/task/pandas/io/excel/_xlrd.py", line 34, in load_workbook
return open_workbook(file_contents=data)
File "/var/task/xlrd/__init__.py", line 115, in open_workbook
zf = zipfile.ZipFile(timemachine.BYTES_IO(file_contents))
File "/var/lang/lib/python3.8/zipfile.py", line 1269, in __init__
self._RealGetContents()
File "/var/lang/lib/python3.8/zipfile.py", line 1354, in _RealGetContents
fp.seek(self.start_dir, 0)
ValueError: negative seek value -62703616
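Comparing printed bytes by eye is error-prone. A minimal sketch of a more reliable comparison is to log the payload's length, a digest, and whether it still starts with the ZIP signature that .xlsx files carry. The helper name `fingerprint` and both sample payloads are hypothetical, chosen only for illustration:

```python
import hashlib


def fingerprint(data: bytes) -> dict:
    """Summarize a payload so two environments can be compared from their logs."""
    return {
        "length": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        # .xlsx files are ZIP containers, which start with the bytes PK 03 04.
        "zip_magic": data[:4] == b"PK\x03\x04",
    }


# Hypothetical payloads: the same bytes received intact, and after a lossy
# round trip through text (invalid UTF-8 bytes get replaced).
intact = b"PK\x03\x04\xff\xfe"
mangled = intact.decode("utf-8", errors="replace").encode("utf-8")

print(fingerprint(intact))
print(fingerprint(mangled))
```

If the length or digest logged in Lambda differs from the local one, the file was altered before it reached the endpoint.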
This was due to AWS API Gateway. I had to allow multipart/form-data in API Gateway and use file = BytesIO(file).read() to be able to consume the file stream properly.
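As a rough sketch of the `BytesIO` part: `BytesIO(data).read()` returns the same bytes as `data`, so the API Gateway setting is probably what made the difference. Wrapping the bytes in a `BytesIO` is still a convenient way to hand a seekable, file-like object to a reader such as `pd.read_excel`. The helpers `looks_like_xlsx` and `make_sample_zip` below are hypothetical and use only the standard library to check that a payload is still a readable ZIP container before parsing it:

```python
import zipfile
from io import BytesIO


def looks_like_xlsx(data: bytes) -> bool:
    """Return True if the bytes form a readable ZIP archive (.xlsx is a ZIP container)."""
    return zipfile.is_zipfile(BytesIO(data))


def make_sample_zip() -> bytes:
    """Build a tiny in-memory ZIP archive standing in for an uploaded workbook."""
    buffer = BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("xl/workbook.xml", "<workbook/>")
    return buffer.getvalue()


good = make_sample_zip()
# Stand-in for a corrupted upload: every ZIP signature is overwritten.
bad = good.replace(b"PK", b"??")

# Wrapping bytes in BytesIO and reading them back does not change them.
assert BytesIO(good).read() == good

print(looks_like_xlsx(good))
print(looks_like_xlsx(bad))
```

In the endpoint, a check like this could run before the pandas call, so that a mangled upload produces a clear error instead of a `zipfile` traceback.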
This is usually caused by API Gateway converting binary data to text. To fix it, add the following under the provider section of your serverless.yml file:
apiGateway:
  binaryMediaTypes:
    - 'multipart/form-data'
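To illustrate why the setting matters, here is a small sketch using only the standard library. It assumes that, once a media type is registered as binary, the Lambda proxy event carries the body base64-encoded with `isBase64Encoded` set to true, and that the ASGI adapter decodes it. The `event` dict is a simplified, hypothetical stand-in for the real event, and the payload bytes are invented:

```python
import base64

# Hypothetical payload: a ZIP header (as found at the start of an .xlsx file)
# followed by bytes that are not valid UTF-8.
original = b"PK\x03\x04\x14\x00\x06\x00\xff\xfe\x80\x81"

# Treated as text: invalid bytes are replaced, so the original data is lost.
as_text = original.decode("utf-8", errors="replace")
corrupted = as_text.encode("utf-8")

# Treated as binary: base64 round-trips every byte unchanged.
event = {
    "body": base64.b64encode(original).decode("ascii"),
    "isBase64Encoded": True,
}
if event["isBase64Encoded"]:
    restored = base64.b64decode(event["body"])
else:
    restored = event["body"].encode("utf-8")

print(corrupted == original)  # False
print(restored == original)   # True
```

The corrupted payload is also longer than the original, which fits the symptom of `zipfile` computing a nonsensical seek offset.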