Using Python 3 to import multiple Excel workbooks and sheets into a single data frame

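I am still learning Python. I am trying to import multiple workbooks, and all of the sheets in each one, into a single data frame.

Here is what I have so far: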

import glob
import os

import pandas as pd
import numpy as np

print(os.getcwd())  # checking the working directory

all_data = pd.DataFrame()  # creating an empty data frame
for file in glob.glob("*.xls"):  # import every file that ends in .xls
    df = pd.read_excel(file)
    # DataFrame.append was removed in pandas 2.0, so use pd.concat
    all_data = pd.concat([all_data, df], ignore_index=True)

print(all_data.shape)  # 12796 rows with 19 columns; we still need a way to check whether this is accurate

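I am having trouble finding any documentation that confirms or explains whether this code imports all of the sheets in each workbook. Some of these files have 15-20 sheets.

Here is the link where I found the explanation of glob: http://pbpython.com/excel-file-combine.html

Any and all advice is greatly appreciated. I am still very new to R and Python, so I would be grateful if you could explain this in as much detail as possible!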

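What you are missing is importing all of the sheets in each workbook. pd.read_excel(file) without a sheet argument reads only the first sheet.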

import glob
import os

import pandas as pd

print(os.getcwd())  # checking the working directory

frames = []  # one data frame per sheet
rows = 0
for file in glob.glob("*.xls"):  # import every file that ends in .xls
    # df = pd.read_excel(file) would import only the first sheet
    xls = pd.ExcelFile(file)
    sheets = xls.sheet_names  # names of all the sheets in this workbook
    for sheet_name in sheets:
        df = xls.parse(sheet_name)  # read this sheet from the already opened workbook
        rows += df.shape[0]
        frames.append(df)  # must be inside the inner loop so every sheet is kept

# DataFrame.append was removed in pandas 2.0; concatenate everything once instead
all_data = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()

print(all_data.shape[0])  # total rows in the combined frame; should equal rows
print(rows)
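
As a possible alternative, pd.read_excel accepts sheet_name=None, in which case it returns a dict mapping each sheet name to its data frame, so the inner loop over sheet names can be dropped. The sketch below is only one way to arrange this: the function name load_all_sheets, the reader parameter, and the source_file and source_sheet columns are hypothetical names introduced here for illustration, not part of pandas or of the code above.

```python
import glob

import pandas as pd


def load_all_sheets(pattern, reader=pd.read_excel):
    """Combine every sheet of every workbook matching `pattern` into one frame.

    `reader` is a hypothetical hook (defaulting to pd.read_excel) so the
    function can be tried out without real Excel files.
    """
    frames = []
    for path in sorted(glob.glob(pattern)):
        # sheet_name=None asks pandas for a dict of {sheet name: DataFrame}
        sheets = reader(path, sheet_name=None)
        for sheet_name, df in sheets.items():
            # Tag each row with its origin so per-sheet counts can be checked
            frames.append(df.assign(source_file=path, source_sheet=sheet_name))
    if not frames:
        return pd.DataFrame()  # nothing matched the pattern
    return pd.concat(frames, ignore_index=True)
```

Calling all_data = load_all_sheets("*.xls") and then all_data.groupby(["source_file", "source_sheet"]).size() should give a row count per workbook and sheet, which can be compared against the files by hand to check that nothing was skipped.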