"Error: List index out of range" Over a list of 952 xlsx files, how edit and then save as csv

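The goal is to merge the important information from sheet1 and sheet2 of 952 Excel files, then save each result as a CSV in a target path, named after a cell value. Thanks to the Stack Overflow community, this mostly works. Now it raises the error "List index out of range". This happens around the halfway point: somewhere between 475 and 516 files are saved correctly before it fails.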

Can anyone get this to work for the whole list?

import glob
from pathlib import Path

import pandas as pd

excel_files = glob.glob('data1/*.xlsx')
path = Path('data2')
for excel in excel_files:
    df1 = pd.read_excel(excel, sheet_name=0, dtype=str, index_col=None)
    df2 = pd.read_excel(excel, sheet_name=1, dtype=str, index_col=None)
    # Two cells from the first sheet form the new name of the 'Sales' column
    i = df1.iat[0,1]
    j = df1.iat[0,15]
    df2.rename(columns={'Date':'Date','Sales':i+j}, inplace=True)
    # Strip or replace characters that are awkward in file names (literal match, not regex)
    df2.columns=df2.columns.str.replace('(','', regex=False)
    df2.columns=df2.columns.str.replace('/','', regex=False)
    df2.columns=df2.columns.str.replace(')','', regex=False)
    df2.columns=df2.columns.str.replace(' ','-', regex=False)
    df2.columns=df2.columns.str.replace('<','-', regex=False)
    df2.columns=df2.columns.str.replace('>','-', regex=False)
    # The first 19 characters of the second column name become the file name
    k = df2.columns[1]
    l = (k)[:19]
    m = l + '.csv'
    df2.to_csv(path/m, encoding='utf-8', index=False)

Edit: stack trace added below, as requested. Thanks for at least taking a look.

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-10-3f727a40755a> in <module>
     23 for excel in excel_files:
     24     df1 = pd.read_excel(excel, sheet_name=0, dtype=str, index_col=None)
---> 25     df2 = pd.read_excel(excel, sheet_name=1, dtype=str, index_col=None)
     26     i = df1.iat[0,1]
     27     j = df1.iat[0,15]

~/anaconda3/lib/python3.7/site-packages/pandas/io/excel/_base.py in read_excel(io, sheet_name, header, names, index_col, usecols, squeeze, dtype, engine, converters, true_values, false_values, skiprows, nrows, na_values, keep_default_na, verbose, parse_dates, date_parser, thousands, comment, skipfooter, convert_float, mangle_dupe_cols, **kwds)
    332         convert_float=convert_float,
    333         mangle_dupe_cols=mangle_dupe_cols,
--> 334         **kwds,
    335     )
    336 

~/anaconda3/lib/python3.7/site-packages/pandas/io/excel/_base.py in parse(self, sheet_name, header, names, index_col, usecols, squeeze, converters, true_values, false_values, skiprows, nrows, na_values, parse_dates, date_parser, thousands, comment, skipfooter, convert_float, mangle_dupe_cols, **kwds)
    886             convert_float=convert_float,
    887             mangle_dupe_cols=mangle_dupe_cols,
--> 888             **kwds,
    889         )
    890 

~/anaconda3/lib/python3.7/site-packages/pandas/io/excel/_base.py in parse(self, sheet_name, header, names, index_col, usecols, squeeze, dtype, true_values, false_values, skiprows, nrows, na_values, verbose, parse_dates, date_parser, thousands, comment, skipfooter, convert_float, mangle_dupe_cols, **kwds)
    439                 sheet = self.get_sheet_by_name(asheetname)
    440             else:  # assume an integer if not a string
--> 441                 sheet = self.get_sheet_by_index(asheetname)
    442 
    443             data = self.get_sheet_data(sheet, convert_float)

~/anaconda3/lib/python3.7/site-packages/pandas/io/excel/_xlrd.py in get_sheet_by_index(self, index)
     44 
     45     def get_sheet_by_index(self, index):
---> 46         return self.book.sheet_by_index(index)
     47 
     48     def get_sheet_data(self, sheet, convert_float):

~/anaconda3/lib/python3.7/site-packages/xlrd/book.py in sheet_by_index(self, sheetx)
    464         :returns: A :class:`~xlrd.sheet.Sheet`.
    465         """
--> 466         return self._sheet_list[sheetx] or self.get_sheet(sheetx)
    467 
    468     def sheet_by_name(self, sheet_name):

IndexError: list index out of range

Chris's answer solved the problem:

It means that particular Excel workbook has only one sheet, unlike the others, so read_excel fails when the sheet_name parameter is 1. If you want to handle that case, you need to wrap your pandas.read_excel call inside a try / except clause.
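
A minimal sketch of that suggestion, not a definitive fix. The helper name `read_two_sheets` and its `reader` parameter are hypothetical, introduced here only so the logic can be tried without real workbooks. It catches the `IndexError` shown in the traceback; catching `ValueError` as well is an assumption, made because newer pandas versions are believed to report a bad sheet index that way.

```python
import pandas as pd


def read_two_sheets(excel, reader=pd.read_excel):
    """Return (df1, df2) for a workbook, or None if it has no second sheet.

    `reader` is a hypothetical hook that defaults to pandas.read_excel.
    """
    df1 = reader(excel, sheet_name=0, dtype=str, index_col=None)
    try:
        df2 = reader(excel, sheet_name=1, dtype=str, index_col=None)
    except (IndexError, ValueError):
        # IndexError is what xlrd raised in the traceback above;
        # ValueError is an assumption covering newer pandas versions.
        return None
    return df1, df2


def find_single_sheet_workbooks(excel_files, reader=pd.read_excel):
    """List the workbooks that would be skipped for lacking a second sheet."""
    return [f for f in excel_files if read_two_sheets(f, reader) is None]
```

Inside the original loop, the two read_excel calls could then become `sheets = read_two_sheets(excel)`, followed by `continue` when `sheets is None`, so single-sheet workbooks are skipped rather than stopping the run. Note that catching `ValueError` this broadly may also hide unrelated read problems, so it is worth printing or logging the skipped file names.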