Use Django to serve a large zip file download with some data appended

I have a view snippet like the one below. It takes a zip filename from the request, and I want to append a string, sign, after the end of the zip file.

@require_GET
def download(request):
    ... skip
    response = HttpResponse(readFile(abs_path, sign),  content_type='application/zip')
    response['Content-Length'] = os.path.getsize(abs_path) + len(sign)
    response['Content-Disposition'] = 'attachment; filename=%s' % filename
    return response

The readFile function is as follows:

def readFile(fn, sign, buf_size=1024<<5):
    f = open(fn, "rb")
    logger.debug("started reading %s" % fn)
    while True:
        c = f.read(buf_size)
        if c:
            yield c
        else:
            break
    logger.debug("finished reading %s" % fn)
    f.close()
    yield sign

It works fine in runserver mode, but it fails on large zip files when I use uwsgi + nginx or apache + mod_wsgi.

It seems to time out, because reading a large file takes a long time.

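I don't understand why, even though I use yield, the browser only starts downloading after the whole file has been read. (I can see the browser waiting until the log line finished reading %s appears.)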

Shouldn't it start downloading as soon as the first chunk has been read?

Is there a better way to serve a file download when I need to append a dynamic string after the file?

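By default, Django does not stream responses: HttpResponse consumes the iterator you pass it and buffers the entire body before anything is sent. Otherwise, middleware could not work the way it currently does.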
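To illustrate why the finished reading log line shows up before the browser receives anything, here is a toy model in plain Python. BufferedResponse, StreamingResponse and chunks are hypothetical stand-ins invented for this sketch, not Django classes. They only mimic the difference between draining an iterator up front and keeping it for later.

```python
def chunks(log):
    """Hypothetical stand-in for readFile: records when each step runs."""
    for i in range(3):
        log.append("read chunk %d" % i)
        yield b"x"
    log.append("finished reading")


class BufferedResponse:
    """Toy model of a buffering response: drains the iterator up front."""

    def __init__(self, content):
        self.content = b"".join(content)


class StreamingResponse:
    """Toy model of a streaming response: keeps the iterator for later."""

    def __init__(self, content):
        self.streaming_content = content


# Buffered: the whole "file" is read while the response is being built,
# before the view has even returned.
buffered_log = []
BufferedResponse(chunks(buffered_log))

# Streaming: nothing is read until the consumer asks for a chunk.
streaming_log = []
resp = StreamingResponse(chunks(streaming_log))
first = next(resp.streaming_content)
```

In the buffered case the generator runs to completion inside the constructor, which matches the behavior described in the question. In the streaming case only the first chunk has been read after the first next() call.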

To get the behavior you are looking for, you need to use StreamingHttpResponse instead.

Usage example from the docs:

import csv

from django.http import StreamingHttpResponse

class Echo(object):
    """An object that implements just the write method of the file-like
    interface.
    """
    def write(self, value):
        """Write the value by returning it, instead of storing in a buffer."""
        return value

def some_streaming_csv_view(request):
    """A view that streams a large CSV file."""
    # Generate a sequence of rows. The range is based on the maximum number of
    # rows that can be handled by a single sheet in most spreadsheet
    # applications.
    rows = (["Row {}".format(idx), str(idx)] for idx in range(65536))
    pseudo_buffer = Echo()
    writer = csv.writer(pseudo_buffer)
    response = StreamingHttpResponse((writer.writerow(row) for row in rows),
                                     content_type="text/csv")
    response['Content-Disposition'] = 'attachment; filename="somefilename.csv"'
    return response

This is a use case for StreamingHttpResponse rather than HttpResponse.
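Applied to the question, a minimal sketch might look like the following. The view signature is an assumption: the question elides how abs_path, filename and sign are obtained, so they are passed in as plain arguments here. iter_file_with_suffix is a hypothetical name for a cleaned-up readFile, and sign is assumed to be bytes so that len(sign) matches the number of bytes sent.

```python
import os


def iter_file_with_suffix(path, suffix, chunk_size=32 * 1024):
    """Yield the file at `path` in chunks, then yield `suffix` (bytes)."""
    # The with-block closes the file even if the client disconnects early
    # and the generator is discarded.
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk
    yield suffix


def download(request, abs_path, filename, sign):
    # Imported here so the generator above can be tried without Django.
    from django.http import StreamingHttpResponse

    response = StreamingHttpResponse(
        iter_file_with_suffix(abs_path, sign),
        content_type="application/zip",
    )
    # A streaming response cannot compute its own length, so set it by hand.
    response["Content-Length"] = os.path.getsize(abs_path) + len(sign)
    response["Content-Disposition"] = 'attachment; filename="%s"' % filename
    return response
```

Because the generator is only advanced as the server sends data, the first chunk can go out as soon as it has been read.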

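It is better to use FileResponse, a subclass of StreamingHttpResponse that is optimized for binary files. It uses wsgi.file_wrapper if the WSGI server provides it; otherwise it streams the file out in small chunks.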

import os
from django.http import FileResponse


def download_file(request):
    _file = '/folder/my_file.zip'
    filename = os.path.basename(_file)
    # Open the full path; FileResponse streams the file and closes it when done.
    response = FileResponse(open(_file, 'rb'), content_type='application/x-zip-compressed')
    # Send only the base name to the client, not the server-side path.
    response['Content-Disposition'] = 'attachment; filename="%s"' % filename
    return response
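The "small chunks" fallback mentioned above can be approximated in plain Python with the two-argument form of iter. This is only a sketch of the idea, not Django's actual code. iter_blocks is a hypothetical helper, and the 4096-byte default is an assumption chosen for illustration.

```python
import io


def iter_blocks(filelike, block_size=4096):
    """Return an iterator over successive blocks of a binary file-like object."""
    # iter(callable, sentinel) keeps calling read() until it returns b"",
    # which is what read() gives back at end of file.
    return iter(lambda: filelike.read(block_size), b"")


# An in-memory file stands in for the zip file on disk.
blocks = list(iter_blocks(io.BytesIO(b"abcdefghij"), block_size=4))
```

Each block is produced only when the consumer asks for it, so memory use stays bounded by block_size regardless of file size.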