Convert large CSV file to Excel using Python 3

This is my code for converting a CSV file to an .xlsx file. It works fine for small CSV files, but when I try it with a larger CSV file it raises an error.

import os
import glob
import csv
from xlsxwriter.workbook import Workbook

for csvfile in glob.glob(os.path.join('.', 'file.csv')):
    workbook = Workbook(csvfile[:-4] + '.xlsx')
    worksheet = workbook.add_worksheet()
    with open(csvfile, 'r', encoding='utf8') as f:
        reader = csv.reader(f)
        for r, row in enumerate(reader):
            for c, col in enumerate(row):
                worksheet.write(r, c, col)
    workbook.close()

The error is:

File "CsvToExcel.py", line 12, in <module>
    for r, row in enumerate(reader):
_csv.Error: field larger than field limit (131072)
Exception ignored in: <bound method Workbook.__del__ of <xlsxwriter.workbook.Workbook object at 0x7fff4e731470>>
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/xlsxwriter/workbook.py", line 153, in __del__
Exception: Exception caught in workbook destructor. Explicit close() may be required for workbook.
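
The `_csv.Error` in the traceback comes from the `csv` module's per-field size cap (131072 characters by default), not from xlsxwriter. A minimal sketch, assuming the oversized fields are real data rather than the result of a quoting problem in the file, is to raise that cap with `csv.field_size_limit` before creating the reader. The helper name `raise_csv_field_limit` is made up for illustration.

```python
import csv
import sys


def raise_csv_field_limit():
    """Raise the csv field size limit as high as the platform accepts.

    Hypothetical helper: sys.maxsize can overflow the underlying C long on
    some platforms (for example 64-bit Windows), so shrink it until accepted.
    """
    limit = sys.maxsize
    while True:
        try:
            csv.field_size_limit(limit)
            return limit
        except OverflowError:
            limit //= 10


# Call once, before any csv.reader is created.
raise_csv_field_limit()
```

Note that an Excel cell holds at most 32,767 characters, so a field that large may still not fit into a single cell intact. If the file is not expected to contain such long fields, it is worth checking for an unbalanced quote character first.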

When working with large files, it is better to use 'constant_memory' to keep memory usage under control, for example:

workbook = Workbook(csvfile + '.xlsx', {'constant_memory': True})

Reference: xlsxwriter.readthedocs.org/en/latest/working_with_memory.html
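
Combined with the loop from the question, the `constant_memory` suggestion might look like the sketch below. It assumes xlsxwriter is installed; the function name `csv_to_xlsx` and its parameters are illustrative, not part of the original code. In this mode rows have to be written in increasing order, which a row-by-row copy already does.

```python
import csv

from xlsxwriter.workbook import Workbook


def csv_to_xlsx(csv_path, xlsx_path):
    """Copy a CSV file into a single-sheet .xlsx file, one row at a time."""
    # constant_memory flushes each row once the next one starts, so rows
    # must be written in sequential order.
    # The with-block closes the workbook even if reading the CSV fails,
    # which avoids the "Explicit close() may be required" destructor message.
    with Workbook(xlsx_path, {'constant_memory': True}) as workbook:
        worksheet = workbook.add_worksheet()
        with open(csv_path, 'r', encoding='utf8', newline='') as f:
            for r, row in enumerate(csv.reader(f)):
                # Write the whole CSV row starting at column 0.
                worksheet.write_row(r, 0, row)
```

A call such as `csv_to_xlsx('file.csv', 'file.xlsx')` would replace the body of the `for csvfile in glob.glob(...)` loop. This only limits memory use; it does not by itself change the `csv` field size limit reported in the traceback.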

I found new code that uses the pandas package, and it works now:

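import pandas

data = pandas.read_csv('Documents_2/AdvMedcsv.csv')
# Group rows by research_id and keep the first non-null value in each column.
data = data.groupby(lambda x: data['research_id'][x]).first()
writer = pandas.ExcelWriter('Documents_2/AdvMed.xlsx', engine='xlsxwriter')
data.to_excel(writer)
writer.save()  # removed in pandas 2.0; use writer.close() there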