Pandas Read Excel function convert index to list

## Summary: Analyze the data in each sheet and get the result
def analyze_data(project, sheet):
    print(project_dict[project],'****'+sheet)

    ## Get data with specific finding type in validation sheet
    sheet_df = pd.read_excel(project_dict[project],sheet, na_values=['NA'])
    print(sheet_df['Feedback Report']=='S.No')
    # Get index of tables
    idx = sheet_df[sheet_df['Feedback Report']=='S.No'].index.tolist()[0]  # line 242
    head = idx - 1

    header_df = sheet_df.iloc[0:head,:]
    sheet_df = sheet_df.iloc[idx:,:]


    ## Replace the header
    header = sheet_df.iloc[0]
    sheet_df.columns = header.tolist()
    sheet_df = sheet_df[1:]

    ####################################
    ## Get data from the time period 

I didn't write the code above; I'm supposed to build a complete Windows executable from it. I can't understand what the code on line 242 is trying to do.

Exception in Tkinter callback
    Traceback (most recent call last):
      File 37-32\lib\tkinter\__init__.py", line 1702, in __call__
        return self.func(*args)
      File QA_Review_Reporting.py", line 751, in sync
        report.read(project_dict)
      File reports.py", line 705, in read
        process()
      File reports.py", line 749, in process
        get_valid_type(project)
      File reports.py", line 185, in get_valid_type
        counts = analyze_data(project, item)
      File reports.py", line 242, in analyze_data
        idx = sheet_df[sheet_df['Feedback Report']=='S.No'].index.tolist()[0]
    IndexError: list index out of range

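As I mentioned in the comments, line 242 filters the DataFrame sheet_df down to the rows where the 'Feedback Report' column equals 'S.No'. It then takes the index of the filtered DataFrame, converts it to a list, and picks the first element of that list with [0]. If no row matches, that list is empty and [0] raises the IndexError shown in the traceback.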

For example:

sheet_df = pd.DataFrame([['No', 1, 2, 3], ['S.No', 4, 5, 6], ['S.No', 7, 8, 9], ['Yes', 10, 11, 12]], columns=['Feedback Report', 'Val 1', 'Val 2', 'Val 3'])

This produces:

  Feedback Report  Val 1  Val 2  Val 3
0              No      1      2      3
1            S.No      4      5      6
2            S.No      7      8      9
3             Yes     10     11     12

Filtering the DataFrame with sheet_df[sheet_df['Feedback Report']=='S.No'] returns:

  Feedback Report  Val 1  Val 2  Val 3
1            S.No      4      5      6
2            S.No      7      8      9

Taking the index of that result and calling tolist() gives:

[1, 2]

Finally, taking the first element with [0] returns:

1
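
The steps above can be run end to end, together with the failing case from the traceback. This is only a minimal sketch: no_marker_df is a made-up DataFrame standing in for a sheet without any 'S.No' cell, and first_matching_index is a hypothetical helper that is not part of the original reports.py.

```python
import pandas as pd


def first_matching_index(df, column, value):
    """Hypothetical helper: index label of the first row where
    df[column] == value, or None when no row matches."""
    matches = df[df[column] == value].index.tolist()
    return matches[0] if matches else None


sheet_df = pd.DataFrame(
    [['No', 1, 2, 3], ['S.No', 4, 5, 6], ['S.No', 7, 8, 9], ['Yes', 10, 11, 12]],
    columns=['Feedback Report', 'Val 1', 'Val 2', 'Val 3'],
)

# Same expression as line 242; it works here because two rows match.
idx = sheet_df[sheet_df['Feedback Report'] == 'S.No'].index.tolist()[0]
print(idx)  # 1

# Assumed stand-in for a sheet that has no 'S.No' row at all.
no_marker_df = sheet_df[sheet_df['Feedback Report'] != 'S.No']

try:
    # The filter matches nothing, so tolist() returns [] and [0] fails.
    no_marker_df[no_marker_df['Feedback Report'] == 'S.No'].index.tolist()[0]
    error_message = None
except IndexError as exc:
    error_message = str(exc)
print(error_message)  # list index out of range

# The guarded helper returns None instead of raising.
print(first_matching_index(no_marker_df, 'Feedback Report', 'S.No'))  # None
```

In the real workbook, the likely thing to check is whether every sheet passed to analyze_data has a 'Feedback Report' cell that is exactly equal to 'S.No'. A different spelling or extra whitespace would also leave the filter empty, though the question does not confirm which sheet triggers the error.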