Python: combine CSVs, keep one header and remove blank lines

I am very new to Python and am trying to figure out the following:

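I have multiple CSV files (monthly files) that I want to combine into one yearly file. Each monthly file has a header, so I am trying to keep the first header and drop the rest. The script below does that, but it leaves 10 blank lines between each month's data.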

Does anyone know what I can add to remove the blank lines?

import shutil
import glob


#import csv files from folder
path = r'data/US/market/merged_data'
allFiles = glob.glob(path + "/*.csv")
allFiles.sort()  # glob lacks reliable ordering, so impose your own if output order matters
with open('someoutputfile.csv', 'wb') as outfile:
    for i, fname in enumerate(allFiles):
        with open(fname, 'rb') as infile:
            if i != 0:
                infile.readline()  # Throw away header on all but first file
            # Block copy rest of file from input to output without parsing
            shutil.copyfileobj(infile, outfile)
            print(fname + " has been imported.")     

Thanks in advance!

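Assuming the dataset fits in memory, I suggest reading each file with pandas, concatenating the DataFrames, and filtering from there. Blank rows will probably show up as NaN.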

import glob

import pandas as pd

# collect the monthly CSV files from the folder
path = r'data/US/market/merged_data'
allFiles = glob.glob(path + "/*.csv")
allFiles.sort()  # glob does not guarantee ordering
# read each file (its header row becomes the column names) and concatenate them all
df = pd.concat((pd.read_csv(fname) for fname in allFiles), ignore_index=True)
# drop rows in which every column is NaN, i.e. the blank rows
df = df.dropna(how='all')
# write to file, leaving out the index column
df.to_csv('someoutputfile.csv', index=False)
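
If you would rather not load everything into pandas, here is a minimal sketch of a streaming alternative that copies line by line and skips blank lines. It assumes comma-delimited files, that the first non-blank line of each file is its header, and that no quoted field spans multiple lines. `merge_csvs` is a hypothetical helper name, not something from the original script.

```python
def merge_csvs(paths, out_path):
    """Concatenate CSV files, keeping only the first header and skipping blank lines."""
    with open(out_path, "w", newline="") as outfile:
        for i, fname in enumerate(paths):
            with open(fname, "r", newline="") as infile:
                header_pending = True
                for line in infile:
                    # Assumption: a line is "blank" if nothing remains after
                    # removing commas and surrounding whitespace
                    if not line.replace(",", "").strip():
                        continue
                    if header_pending:
                        header_pending = False
                        if i != 0:
                            continue  # drop the header of every file after the first
                    # Make sure a file without a trailing newline does not run
                    # into the first row of the next file
                    outfile.write(line if line.endswith("\n") else line + "\n")
```

With the variables from the question, this could be called as `merge_csvs(allFiles, 'someoutputfile.csv')` after `allFiles.sort()`. Memory use stays flat because only one line is held at a time, at the cost of not validating the CSV structure.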