Cleaning multiple documents and saving them to one workbook with a loop in Pandas

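I have a list of nearly 30 xlsx files that all share the same format. I have working data-cleaning code, and I want to clean each file and save the results to separate sheets in a single workbook. I think a loop is the best fit for the job, but something is missing. I have seen functions that save multiple sheets to one workbook, but I want to read_excel, clean the data frame, save it to a sheet, and delete the data frame. What actually happens is that the new Excel document contains only the sheet for the last item in the list.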

BOX = [
"aa1",
"aa2",
"aa3"]

for B in BOX:

    filename = B+".xls"

    #create data frame
    BDF = pd.read_excel(r'C:\Projects\BOXES' + '\\' + filename)
    #clean data frame
    BDF = BDF.dropna(how="all")
    BDF['Total Cost'] = BDF['Total Cost'].str.replace('.', '')
    BDF.columns = ['LVL', 'PN', 'Leadtime', 'Description', 'Ext QTY']
    BDF.PN = BDF.PN.str.strip()

    sheetname=B
    #save to sheet
    with pd.ExcelWriter(r'C:\Projects\BOXES\BOXED.xlsx') as writer:
        BDF.to_excel(writer, sheet_name=B, index=False)
    #delete data frame before repeating 
    del(BDF)
    del(B)

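You should put the with statement outside the loop. Inside the loop it opens and closes the file on every iteration, and because pd.ExcelWriter opens the file in write mode by default, each pass replaces the workbook, so only the last sheet survives. The following should work: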

import pandas as pd

BOX = [
"aa1",
"aa2",
"aa3"]

# open the writer once so every sheet lands in the same workbook
with pd.ExcelWriter(r'C:\Projects\BOXES\BOXED.xlsx') as writer:

    for B in BOX:

        filename = B + ".xls"

        #create data frame
        # a raw string cannot end in a backslash, so the separator is added separately
        BDF = pd.read_excel(r'C:\Projects\BOXES' + '\\' + filename)
        #clean data frame
        BDF = BDF.dropna(how="all")
        # regex=False treats '.' as a literal dot rather than "any character"
        BDF['Total Cost'] = BDF['Total Cost'].str.replace('.', '', regex=False)
        BDF.columns = ['LVL', 'PN', 'Leadtime', 'Description', 'Ext QTY']
        BDF.PN = BDF.PN.str.strip()

        #save to sheet
        BDF.to_excel(writer, sheet_name=B, index=False)
        # BDF is rebound on the next iteration, so no explicit del is needed
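
The same idea can be split into two small functions, which makes the "one writer, many sheets" pattern easier to test on its own. This is only a minimal sketch: clean_frame and write_sheets are hypothetical names that do not appear in the question, the cleaning step is reduced to dropping empty rows and stripping PN, and writing .xlsx files assumes an Excel engine such as openpyxl is installed.

```python
from pathlib import Path

import pandas as pd


def clean_frame(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical cleaning step: drop fully empty rows and strip the PN column."""
    df = df.dropna(how="all").copy()
    df["PN"] = df["PN"].str.strip()
    return df


def write_sheets(frames: dict, out_path: Path) -> None:
    """Write each frame to its own sheet through a single ExcelWriter."""
    # The writer is opened once, outside the loop, so earlier sheets are kept.
    with pd.ExcelWriter(out_path) as writer:
        for sheet_name, df in frames.items():
            df.to_excel(writer, sheet_name=sheet_name, index=False)
```

With the files from the question, this could be called as write_sheets({B: clean_frame(pd.read_excel(path_for(B))) for B in BOX}, Path("BOXED.xlsx")), where path_for is a placeholder for however the input path is built.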