aiohttp: huge CSV saved from a URL is incomplete

I have code that downloads huge CSV files stored in .gz archives.

import asyncio
import zlib

import aiohttp
from aiohttp import ClientTimeout
from aiohttp.client_exceptions import InvalidURL

timeout = ClientTimeout(total=600)


async def download(link, session, sem):
    # The output file name is the archive name without the ".gz" suffix.
    out_file_path = link.split("/")[-1][:-3]
    try:
        async with sem, session.get(
                'http://111.11.111.111/test/' + link) as resp:
            # wbits=MAX_WBITS | 32 lets zlib auto-detect the gzip header.
            d = zlib.decompressobj(zlib.MAX_WBITS | 32)
            with open(out_file_path, 'wb') as file:
                # iter_chunks() yields (bytes, end_of_http_chunk) tuples.
                async for data, _ in resp.content.iter_chunks():
                    chunk = d.decompress(data)
                    file.write(chunk)
                return True

    except InvalidURL as invalid_url:
        ...
    except asyncio.TimeoutError as e:
        ...


async def main():
    links = ['test/1.csv.gz']
    sem = asyncio.Semaphore(10)
    async with aiohttp.ClientSession(
            auth=aiohttp.BasicAuth(
                'test',
                'test'),
            timeout=timeout
    ) as session:
        tasks = (download(
            link=link,
            session=session,
            sem=sem
        ) for link in links)
        results = await asyncio.gather(*tasks)
        return results


asyncio.run(main())

This code runs without errors, but every file it saves ends up at only about 100 MB, even though each archive I download reports a much larger Content-Length.

How can I fix this so that the full data is downloaded?

I solved my problem in the following way:

# Needs: import gzip, shutil; from io import BytesIO
async with downloading_queue, aiohttp.ClientSession(
        auth=aiohttp.BasicAuth(
            self.config['log'],
            self.config['pwd']),
        timeout=CLIENT_TIMEOUT
).get(url=url) as resp:
    # Read the whole compressed body into memory, then decompress it.
    file = BytesIO(await resp.content.read())
    with gzip.open(file, 'rt') as decompressed_file:
        with open(out_file_path, 'w') as outfile:
            shutil.copyfileobj(decompressed_file, outfile)
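
For reference, the approach above buffers the entire response in memory before decompressing it. Below is a minimal sketch of a streaming alternative, assuming the same kind of endpoint and credentials as in the question (the URL, login, file name and the function name `download_streaming` are placeholders): each chunk is decompressed as it arrives with `zlib.decompressobj`, and the decompressor is flushed at the end so no buffered tail data is lost.

import asyncio
import zlib

import aiohttp


async def download_streaming(url, out_file_path, session):
    """Stream a .gz response to disk without holding it all in memory."""
    # wbits=MAX_WBITS | 32 tells zlib to auto-detect the gzip header.
    decompressor = zlib.decompressobj(zlib.MAX_WBITS | 32)
    async with session.get(url) as resp:
        resp.raise_for_status()
        with open(out_file_path, 'wb') as outfile:
            # iter_chunked(n) yields plain bytes objects of at most n bytes.
            async for chunk in resp.content.iter_chunked(1 << 20):
                outfile.write(decompressor.decompress(chunk))
            # Write whatever is still buffered inside the decompressor.
            outfile.write(decompressor.flush())


async def main():
    # No overall deadline; ClientTimeout(total=600) would abort any transfer
    # that takes longer than 600 seconds, however large the file is.
    timeout = aiohttp.ClientTimeout(total=None)
    auth = aiohttp.BasicAuth('test', 'test')  # placeholder credentials
    async with aiohttp.ClientSession(auth=auth, timeout=timeout) as session:
        await download_streaming(
            'http://111.11.111.111/test/test/1.csv.gz',  # placeholder URL
            '1.csv',
            session,
        )


asyncio.run(main())

Memory use stays bounded by the chunk size regardless of archive size, and if the 600-second total timeout in the original code is what cuts off very large transfers, dropping it (or using a per-read timeout instead) is one way around that.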