Anything special required to use Tornado's data_received method without blocking?

Question

I have a POST request handler that takes streaming data as input and writes it to AWS APIs. The data is sent to AWS in multiple inner requests, which are made via boto3. I believe boto3 is blocking but may release the GIL when doing I/O: it seems to use urllib3.connection internally. So I wrapped it in a call to run_in_executor, similar to this cut-down code:

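from tornado.ioloop import IOLoop
from tornado.web import RequestHandler, stream_request_body

# BufferedParser and send_via_boto are defined elsewhere (not shown here).
@stream_request_body
class Handler(RequestHandler):
    async def prepare(self):
        self.parser = BufferedParser()

    async def data_received(self, chunk):
        # Called once per chunk of the request body as it arrives.
        complete_part = self.parser.receive(chunk)
        if complete_part:
            # Run the blocking boto3 call on the default thread pool.
            await IOLoop.current().run_in_executor(
                None, self.send_via_boto, complete_part)

    async def post(self):
        # Called after the entire body has been received.
        self.set_header('Content-Type', 'text/plain')
        self.write("OK")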

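My question is: will the awaited send_via_boto call block the client from uploading the next chunk? Do I need to implement something fancier, or should this already be non-blocking?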

Answer 1

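"block the client from uploading the next chunk": the client does not upload data directly to your application but to a TCP socket. That socket has a finite size, i.e. a buffer, so if the buffer is full the client waits until it has been drained and then continues uploading. With Tornado's help, your application reads from this TCP socket buffer and frees up the space taken by whatever it has read. Sending chunks to AWS does not block the client from uploading data to the TCP socket, even if you send the data to AWS in a blocking way (i.e. without run_in_executor, although you would then block your server from serving other requests). If you send data to AWS more slowly than the client uploads it, your application becomes the bottleneck and will prevent (which is technically different from blocking) the client from uploading more.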
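The difference between blocking the server and merely pausing one handler can be sketched without Tornado or boto3. In the example below, blocking_send is a hypothetical stand-in for the boto3 call (it just sleeps), and heartbeat stands in for other requests the server could be handling. It uses plain asyncio, which is the event loop Tornado runs on in Python 3. As far as I understand Tornado's flow control, awaiting inside data_received only delays delivery of the next chunk to that one handler.

```python
import asyncio
import time


def blocking_send(part):
    # Hypothetical stand-in for a blocking boto3 upload: sleeps, then
    # reports how many bytes it "sent".
    time.sleep(0.2)
    return len(part)


async def heartbeat(ticks, stop):
    # Stand-in for other work on the event loop (e.g. other requests).
    while not stop.is_set():
        ticks.append(time.monotonic())
        await asyncio.sleep(0.02)


async def demo():
    loop = asyncio.get_running_loop()
    ticks = []
    stop = asyncio.Event()
    hb = asyncio.create_task(heartbeat(ticks, stop))
    # The blocking call runs on the default thread pool; this coroutine is
    # suspended while it runs, but the event loop keeps running heartbeat.
    sent = await loop.run_in_executor(None, blocking_send, b"x" * 1024)
    stop.set()
    await hb
    return sent, len(ticks)


if __name__ == "__main__":
    sent, tick_count = asyncio.run(demo())
    print(f"sent {sent} bytes; heartbeat ticked {tick_count} times meanwhile")
```

If blocking_send were called directly inside demo instead of through run_in_executor, heartbeat would get no chance to run during the sleep, which is the "block your server from serving other requests" case described above.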

asynchronous

tornado

nonblocking

python-3.x