Serving a zip file of pandas data frames within a Flask app

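I am trying to return a zip of pandas data frames from my Flask app. One of my views returns the output of a serve_csv function. Here is my original serve_csv function, which successfully downloads the specified CSV.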

def serve_csv(dataframe,filename):
    buffer = StringIO.StringIO()
    dataframe.to_csv(buffer, encoding='utf-8', index=False)
    buffer.seek(0)
    return send_file(buffer,
         attachment_filename=filename,
         mimetype='text/csv')

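I am trying to change it into a serve_zip function that takes a list of pandas data frames and returns a zip of the corresponding CSV files. However, I get an error saying that an object of type NoneType has no len(). I suspect it has to do with how I am writing to the buffer, but after reading the documentation I am not sure how to fix it. Here is my current function: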

def serve_zip(data_list,filename):
    '''data_list: a list of pandas data frames
    filename'''
    zipped_file = StringIO.StringIO()
    with zipfile.ZipFile(zipped_file, 'w') as zip:
        for i, dataframe in enumerate(data_list):
            print type(dataframe.to_csv(zipped_file, encoding='utf-8', index=False))
        zip.writestr(filename, dataframe.to_csv(zipped_file, encoding='utf-8', index=False))
    zipped_file.seek(0)
    return send_file(zipped_file,
         attachment_filename=filename,
         mimetype='application/octet-stream')

My stack trace:

Full traceback: 
127.0.0.1 - - [17/Feb/2015 15:57:21] "POST /part2/ HTTP/1.1" 500 -
Traceback (most recent call last):
  File "/private/var/folders/f4/qr09tm_169n4b9xyjsrjv8680000gn/T/tmpmQ95cJ/.deps/Flask-0.10.1-py2.7.egg/flask/app.py", line 1836, in __call__
return self.wsgi_app(environ, start_response)
  File "/private/var/folders/f4/qr09tm_169n4b9xyjsrjv8680000gn/T/tmpmQ95cJ/.deps/Flask-0.10.1-py2.7.egg/flask/app.py", line 1820, in wsgi_app
response = self.make_response(self.handle_exception(e))
  File "/private/var/folders/f4/qr09tm_169n4b9xyjsrjv8680000gn/T/tmpmQ95cJ/.deps/Flask-0.10.1-py2.7.egg/flask/app.py", line 1403, in handle_exception
reraise(exc_type, exc_value, tb)
  File "/private/var/folders/f4/qr09tm_169n4b9xyjsrjv8680000gn/T/tmpmQ95cJ/.deps/Flask-0.10.1-py2.7.egg/flask/app.py", line 1817, in wsgi_app
response = self.full_dispatch_request()
  File "/private/var/folders/f4/qr09tm_169n4b9xyjsrjv8680000gn/T/tmpmQ95cJ/.deps/Flask-0.10.1-py2.7.egg/flask/app.py", line 1477, in full_dispatch_request
rv = self.handle_user_exception(e)
  File "/private/var/folders/f4/qr09tm_169n4b9xyjsrjv8680000gn/T/tmpmQ95cJ/.deps/Flask-0.10.1-py2.7.egg/flask/app.py", line 1381, in handle_user_exception
reraise(exc_type, exc_value, tb)
  File "/private/var/folders/f4/qr09tm_169n4b9xyjsrjv8680000gn/T/tmpmQ95cJ/.deps/Flask-0.10.1-py2.7.egg/flask/app.py", line 1475, in full_dispatch_request
rv = self.dispatch_request()
  File "/private/var/folders/f4/qr09tm_169n4b9xyjsrjv8680000gn/T/tmpmQ95cJ/.deps/Flask-0.10.1-py2.7.egg/flask/app.py", line 1461, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
  File "/private/var/folders/f4/qr09tm_169n4b9xyjsrjv8680000gn/T/tmpmQ95cJ/webapp.py", line 110, in part2
return serve_zip([table3, agg_table], 'my_file.csv')
  File "/private/var/folders/f4/qr09tm_169n4b9xyjsrjv8680000gn/T/tmpmQ95cJ/webapp.py", line 61, in serve_csv
zip.writestr(filename, dataframe.to_csv(zipped_file, encoding='utf-8', index=False))
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/zipfile.py", line 1216, in writestr
zinfo.file_size = len(bytes)            # Uncompressed size
TypeError: object of type 'NoneType' has no len()
127.0.0.1 - - [17/Feb/2015 15:57:21] code 400, message Bad request version ('RTSP/1.0')
127.0.0.1 - - [17/Feb/2015 15:57:21] "GET /info?txtAirPlay&txtRAOP RTSP/1.0" 400 -
127.0.0.1 - - [17/Feb/2015 15:57:21] "GET /part2/?__debugger__=yes&cmd=resource&f=style.css HTTP/1.1" 200 -
127.0.0.1 - - [17/Feb/2015 15:57:21] "GET /part2/?__debugger__=yes&cmd=resource&f=jquery.js HTTP/1.1" 200 -
127.0.0.1 - - [17/Feb/2015 15:57:21] "GET /part2/?__debugger__=yes&cmd=resource&f=debugger.js HTTP/1.1" 200 -
127.0.0.1 - - [17/Feb/2015 15:57:21] "GET /part2/?__debugger__=yes&cmd=resource&f=ubuntu.ttf HTTP/1.1" 200 -
127.0.0.1 - - [17/Feb/2015 15:57:21] "GET /part2/?__debugger__=yes&cmd=resource&f=console.png HTTP/1.1" 200 -
127.0.0.1 - - [17/Feb/2015 15:57:21] "GET /part2/?__debugger__=yes&cmd=resource&f=source.png HTTP/1.1" 200 -
127.0.0.1 - - [17/Feb/2015 15:57:21] "GET /part2/?__debugger__=yes&cmd=resource&f=console.png HTTP/1.1" 200 -

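It looks like dataframe.to_csv does not return a string when it is given a buffer object; it writes the CSV data into the buffer and returns None, which is what zip.writestr then fails on when it calls len(). Writing into that buffer is not what you want anyway, because the buffer is supposed to end up holding a valid zip file. Instead, pass None: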

zip.writestr(filename, dataframe.to_csv(None, encoding='utf-8', index=False))

That way to_csv returns the CSV as a string, and the zip object adds that string to the zip archive (which you are buffering in memory via StringIO).
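
For reference, the whole idea can be sketched as a small helper. This is only a sketch under a few assumptions that are not in the original post: it targets Python 3 (so it uses io.BytesIO rather than StringIO.StringIO), the name build_zip is hypothetical, and it takes a dict of entry names so that each data frame gets its own file name inside the archive instead of all sharing the download filename.

```python
import io
import zipfile


def build_zip(frames):
    """Return the bytes of a zip archive holding one CSV per data frame.

    frames: a dict mapping archive entry names (for example 'table3.csv')
    to objects with a pandas-style to_csv method. The dict shape is an
    assumption made for this sketch, not part of the original question.
    """
    archive = io.BytesIO()
    with zipfile.ZipFile(archive, 'w', zipfile.ZIP_DEFLATED) as zf:
        for name, frame in frames.items():
            # With None as the target, to_csv returns the CSV text
            # instead of writing it to a buffer and returning None.
            csv_text = frame.to_csv(None, index=False)
            # writestr accepts str (encoded as UTF-8) or bytes.
            zf.writestr(name, csv_text)
    # The archive is complete once the with block has closed it.
    return archive.getvalue()
```

In a view, the result could then be wrapped as io.BytesIO(build_zip(...)) and passed to send_file with mimetype='application/zip'. Note that the keyword for the download name depends on the Flask version: older releases such as the 0.10.1 shown in the traceback use attachment_filename, while newer ones use download_name.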