How to efficiently iterate through selected Excel sheets in Python and append them into a DataFrame?
I want to avoid manually entering the parameters for each Excel sheet, as in the following:
import pandas as pd
df1 = pd.read_excel(r"C:\Users\XY\Sales2020.xlsm",
sheet_name = "Europe",usecols=[1,2,4,6],header=4) #reads sheet "Europe", selected columns and skips first 4 rows
df1["Continent"]= "Europe" #adds a new column with sheet name
df1=pd.DataFrame(df1) #creates df
df1.columns=["ID", "Product", "Quantity","Price","Continent"] #renames columns in df
df2 = pd.read_excel(r"C:\Users\XY\Sales2020.xlsm",
sheet_name = "North America",usecols=[1,2,4,6],header=4)
df2["Continent"]= "North America"
df2=pd.DataFrame(df2)
df2.columns=["ID", "Product", "Quantity","Price","Continent"]
df = pd.concat([df1, df2]) #concats the dfs
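I would like to iterate through the sheets automatically and put the data from all of them into a single DataFrame.
I tried something like the following, but it does not do the job, because the loop only picks up data from the last sheet in the list: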
import pandas as pd
sheets=["Europe","North America"]
for i in sheets:
    dataset = pd.read_excel(r"C:\Users\XY\Sales2020.xlsm",
                            sheet_name = i,usecols=[1,2,4,6],header=4) #read Excel
    dataset["Continent"]= i #adds a new column with sheet name
    dataset = pd.DataFrame(dataset) #creates df
    dataset.columns=["ID", "Product", "Quantity","Price","Continent"] #renames columns in df
    df= dataset.append(dataset) #this should append data from sheets into a single df
Do you have any ideas? How can I solve this?
Thanks a lot.
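There is no need to create a new DataFrame when dataset is already a DataFrame; pd.read_excel returns one. Also, dataset.append(dataset) only appends the current sheet to itself, so nothing from the earlier sheets is carried over. Collect each sheet's frame in a list and concatenate once after the loop: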
import pandas as pd
sheets=["Europe","North America"]
df_list=[]
for i in sheets:
    dataset = pd.read_excel(r"C:\Users\XY\Sales2020.xlsm",
                            sheet_name = i,usecols=[1,2,4,6],header=4) #read Excel
    dataset["Continent"]= i #adds a new column with sheet name
    df_list.append(dataset)
df=pd.concat(df_list)
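As a possible variation on the loop above: pd.read_excel also accepts a list for sheet_name and then returns a dict of DataFrames keyed by sheet name, so the file only has to be named once. The sketch below assumes that layout and also restores the column renaming from the question. The helper names combine_sheets and read_selected_sheets and the COLUMNS constant are made up for illustration and are not part of pandas.

```python
import pandas as pd

# Column names taken from the question; the constant itself is hypothetical.
COLUMNS = ["ID", "Product", "Quantity", "Price"]


def combine_sheets(frames):
    """Stack a {sheet_name: DataFrame} mapping into a single DataFrame."""
    labelled = []
    for name, frame in frames.items():
        frame = frame.copy()        # leave the caller's frames untouched
        frame.columns = COLUMNS     # rename the four selected columns
        frame["Continent"] = name   # record which sheet each row came from
        labelled.append(frame)
    # Concatenate once, after the loop, and renumber the rows from 0.
    return pd.concat(labelled, ignore_index=True)


def read_selected_sheets(path, sheets):
    """Read only the listed sheets and combine them (hypothetical helper)."""
    # Passing a list as sheet_name makes read_excel return a dict of frames.
    frames = pd.read_excel(path, sheet_name=sheets,
                           usecols=[1, 2, 4, 6], header=4)
    return combine_sheets(frames)
```

With the file from the question this would be called as df = read_selected_sheets(r"C:\Users\XY\Sales2020.xlsm", ["Europe", "North America"]). ignore_index=True is a choice made here so the combined frame gets a fresh 0..n-1 index; drop it if the original row labels matter.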