How to determine the optimal buffer size with asyncio/aiohttp

Question

How do we decide the optimal parameter for read() when working with asyncio in Python? 12 bytes? 100 bytes?

async with self._session.get(url, headers=headers) as response:
    chunk_size = 12
    chunks = []

    while True:
        chunk = await response.content.read(chunk_size)
        if not chunk:
            break
        chunks.append(chunk)

    # Decode once at the end: decoding each chunk separately can fail
    # when a multi-byte UTF-8 character is split across two chunks.
    result = b''.join(chunks).decode('utf8')

Answer 1

How do we decide the optimal parameter for read() when working with asyncio in python? 12 bytes? 100 bytes?

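You can safely pick a much larger number than that. If the number is too small (for example, just 1), your loop will consist of many calls to StreamReader.read, each of which carries a fixed overhead: it has to check whether there is something in the buffer, and then either return a portion of it and update the remaining buffer, or wait for something new to arrive. On the other hand, if the requested size is excessively large, it could in theory require unnecessarily large allocations. But because StreamReader.read is allowed to return less data than requested, it never returns a chunk larger than the internal buffer (64 KiB by default), so this is not a concern.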
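The per-call overhead described above can be made visible by counting how many read() calls it takes to drain a stream. This is only a sketch: it uses the standard library's asyncio.StreamReader fed from memory as a stand-in for a real network stream, and the helper name count_reads and the 4096-byte payload are assumptions made for illustration.

```python
import asyncio


async def count_reads(payload: bytes, chunk_size: int) -> tuple[int, bytes]:
    """Drain an in-memory StreamReader.

    Returns the number of read() calls that produced data, plus the data.
    """
    # Created inside a coroutine so a running event loop is available.
    reader = asyncio.StreamReader()
    reader.feed_data(payload)  # pretend the whole payload already arrived
    reader.feed_eof()          # and that the peer closed the connection

    calls = 0
    chunks = []
    while True:
        # read(n) returns at most n bytes, and fewer if less is buffered.
        chunk = await reader.read(chunk_size)
        if not chunk:  # b"" means end of stream
            break
        calls += 1
        chunks.append(chunk)
    return calls, b"".join(chunks)


async def main() -> None:
    payload = b"x" * 4096  # illustrative payload size
    for size in (1, 1024, 1_000_000):
        calls, data = await count_reads(payload, size)
        print(f"chunk_size={size}: {calls} read() calls, {len(data)} bytes")


if __name__ == "__main__":
    asyncio.run(main())
```

With everything already buffered, a chunk size of 1 needs one call per byte, 1024 needs four calls, and an oversized request is satisfied by a single call that simply returns the 4096 bytes available.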

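To sum up: any number above 1024 or so will do fine, because it is large enough to avoid an unnecessary number of function calls. In most cases, requesting more than 65536 is the same as requesting 65536. I tend to request 1024 bytes when I don't care about absolute best performance (smaller chunks are easier on the eyes when debugging), and a larger value such as 16384 when I do. The numbers don't have to be powers of 2, by the way; that is just a convention from lower-level languages.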

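When dealing specifically with aiohttp streams, you can call readany, a method that simply returns whatever data is available and, if nothing is available, waits for some data to arrive and returns that. This is probably the best option if you are dealing with aiohttp streams, because it just gives you the data from the internal buffer without you having to worry about its size.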
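A readany-based loop might look roughly like the sketch below. The helper names read_body_any and fetch_text are hypothetical, and session is assumed to be an aiohttp.ClientSession created by the caller; the only aiohttp calls relied on are session.get and response.content.readany().

```python
async def read_body_any(stream) -> bytes:
    """Drain a stream via readany(); no chunk size has to be chosen.

    `stream` is assumed to expose an awaitable readany() that returns
    b"" at end of stream, as aiohttp's response.content does.
    """
    chunks = []
    while True:
        # Returns whatever is buffered, or waits until some data arrives.
        chunk = await stream.readany()
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


async def fetch_text(session, url: str) -> str:
    """Hypothetical helper: `session` is assumed to be an aiohttp.ClientSession."""
    async with session.get(url) as response:
        body = await read_body_any(response.content)
    # Decode once, after all bytes are collected, so multi-byte
    # characters split across chunks are handled correctly.
    return body.decode("utf-8")
```

Passing the session in, rather than creating it inside the helper, keeps the sketch independent of how the caller manages connection lifetime.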

python

python-asyncio

aiohttp