Consolidating a result summary from different sheets into a new sheet of an existing Excel file using Python
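After every run I get a new CSV file containing the test results, and I am able to consolidate all of those files into one Excel file, with each run as a sheet name.
I use xlwt for this.
Code for others' reference, used to add the different files to the consolidated Excel file: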
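import csv
import os

import xlwt

# path is the directory that holds the per-run CSV files
book = xlwt.Workbook()
for file in os.listdir(path):
    if file.endswith('.csv'):
        # One sheet per CSV file, named after the file without its extension
        sheet = book.add_sheet(file[:-4])
        with open(os.path.join(path, file), newline='') as csv_file:
            reader = csv.reader(csv_file)
            for i, row in enumerate(reader):
                for j, each in enumerate(row):
                    sheet.write(i, j, each)
book.save("consolidate_result.xls")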
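Since xlwt only writes the legacy .xls format, while openpyxl (used further down) only reads .xlsx, the same consolidation can also be sketched with pandas so that the workbook is saved as .xlsx. This is a minimal sketch, not the code from the question: the helper name `csvs_to_workbook`, the demo file names, and the assumption that every CSV starts with a header row are all illustrative.

```python
import tempfile
from pathlib import Path

import pandas as pd


def csvs_to_workbook(csv_dir, out_file):
    """Write every CSV in csv_dir to its own sheet of out_file (.xlsx)."""
    csv_files = sorted(Path(csv_dir).glob("*.csv"))
    with pd.ExcelWriter(out_file, engine="openpyxl") as writer:
        for csv_file in csv_files:
            # dtype=str keeps time values such as 00:06:43 as plain text
            frame = pd.read_csv(csv_file, dtype=str)
            # Excel limits sheet names to 31 characters
            frame.to_excel(writer, sheet_name=csv_file.stem[:31], index=False)
    return [f.stem[:31] for f in csv_files]


# Small demonstration with made-up data in a temporary directory
demo_dir = Path(tempfile.mkdtemp())
header = "Test Name,Test Status,Time Value\n"
(demo_dir / "Run 1.csv").write_text(header + "Test 1,PASS,00:06:43\n")
(demo_dir / "Run 2.csv").write_text(header + "Test 1,PASS,00:05:43\n")
out_path = demo_dir / "consolidate_result.xlsx"
sheet_names = csvs_to_workbook(demo_dir, out_path)
```

Each sheet is named after its CSV file, as in the xlwt version, so the result can be read back with pd.read_excel(out_path, sheet_name=None).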
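Now I have a scenario where I have to provide a summary of the different test runs in a new summary sheet in the same Excel file.
This is my sample Excel file. It contains multiple sheets in the following format: the first column is the test name, the second column is the test status, and the third column is the time value for that test: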
Sheet 1, named Run 1
Test Name    Test Status    Time Value
Test 1       PASS           00:06:43
Test 2       Fail           00:06:24
Test 3       PASS           00:06:10
Test 4       PASS           00:05:25
Test 5       Fail           00:05:07
Test 6       PASS           00:02:45
Sheet 2, named Run 2
Test Name    Test Status    Time Value
Test 1       PASS           00:05:43
Test 2       Fail           00:04:24
Test 3       PASS           00:05:10
Test 4       PASS           00:06:25
Test 5       PASS           00:03:07
Test 6       PASS           00:04:45
Sheet 3, named Run 3
Test Name    Test Status    Time Value
Test 1       PASS           00:06:40
Test 2       PASS           00:06:52
Test 3       PASS           00:05:50
Test 4       PASS           00:05:35
Test 5       PASS           00:06:17
Test 6       PASS           00:03:55
What I want to achieve is a new sheet in the existing Excel file, named for example Status or consolidation results, in this format:
Test Name    Test-Status    Run 1       Run 2       Run 3
Test 1       Pass           00:06:43    00:05:38    00:06:43
Test 2       Fail           00:06:24    00:05:56    00:06:24
Test 3       Pass           00:06:10    00:06:43    00:06:10
Test 4       Pass           00:05:25    00:05:32    00:05:25
Test 5       Fail           00:05:07    00:05:22    00:05:07
Test 6       Pass           00:02:45    00:07:26    00:02:45
I tried reading the Excel file with pd.ExcelFile(filename), then iterating over the sheets and appending each sheet's data to a result list:
xls = pd.ExcelFile(fname)
df = pd.read_excel(fname, sheet_name=None)
result = []
for x in range(len(df.keys())):
    # Read each sheet by name into its own DataFrame
    dfx = pd.read_excel(xls, xls.sheet_names[x])
    result.append(dfx)
Can someone help me consolidate the results into a new sheet? When I use writer = pd.ExcelWriter(fname, engine='openpyxl') and df.to_excel(writer, sheet_name='Summary'), it overwrites the Excel file and adds a blank sheet named Summary.
Thanks in advance.
I suggest using the sheet_name=None parameter to create an ordered dictionary of DataFrames, one for each sheet:
import pandas as pd

path = "file.xlsx"
df = pd.read_excel(path, sheet_name=None)
print(df)
OrderedDict([('Run 1', Test Name Test Status Time Value
0 Test 1 PASS 00:06:43
1 Test 2 Fail 00:06:24
2 Test 3 PASS 00:06:10
3 Test 4 PASS 00:05:25
4 Test 5 Fail 00:05:07
5 Test 6 PASS 00:02:45), ('Run 2', Test Name Test Status Time Value
0 Test 1 PASS 00:05:43
1 Test 2 Fail 00:04:24
2 Test 3 PASS 00:05:10
3 Test 4 PASS 00:06:25
4 Test 5 PASS 00:03:07
5 Test 6 PASS 00:04:45), ('Run 3', Test Name Test Status Time Value
0 Test 1 PASS 00:06:40
1 Test 2 PASS 00:06:52
2 Test 3 PASS 00:05:50
3 Test 4 PASS 00:05:35
4 Test 5 PASS 00:06:17
5 Test 6 PASS 00:03:55)])
Then loop over the dictionary and concat the sheets together, aligned on the columns Test Name and Test Status, which is why set_index is used. NaN is added for values that do not match:
d = {k:v.set_index(['Test Name','Test Status'])['Time Value'] for k, v in df.items()}
result= pd.concat(d, axis=1).reset_index()
print (result)
Test Name Test Status Run 1 Run 2 Run 3
0 Test 1 PASS 00:06:43 00:05:43 00:06:40
1 Test 2 Fail 00:06:24 00:04:24 NaN
2 Test 2 PASS NaN NaN 00:06:52
3 Test 3 PASS 00:06:10 00:05:10 00:05:50
4 Test 4 PASS 00:05:25 00:06:25 00:05:35
5 Test 5 Fail 00:05:07 NaN NaN
6 Test 5 PASS NaN 00:03:07 00:06:17
7 Test 6 PASS 00:02:45 00:04:45 00:03:55
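Because Test Status is part of the index, a test whose status changes between runs ends up on two rows. If one row per test is wanted, as in the layout requested in the question, one option is to align on Test Name only and derive a single status. The sketch below uses made-up frames in place of the dictionary from read_excel, and the rule "Fail if any run failed" is an assumption, not something the question specifies.

```python
import pandas as pd

# Made-up frames standing in for the dict returned by read_excel(..., sheet_name=None)
sheets = {
    "Run 1": pd.DataFrame({"Test Name": ["Test 1", "Test 2"],
                           "Test Status": ["PASS", "Fail"],
                           "Time Value": ["00:06:43", "00:06:24"]}),
    "Run 2": pd.DataFrame({"Test Name": ["Test 1", "Test 2"],
                           "Test Status": ["PASS", "PASS"],
                           "Time Value": ["00:05:43", "00:04:24"]}),
}

# Align on Test Name only, so each test keeps a single row
times = pd.concat(
    {k: v.set_index("Test Name")["Time Value"] for k, v in sheets.items()}, axis=1)
statuses = pd.concat(
    {k: v.set_index("Test Name")["Test Status"] for k, v in sheets.items()}, axis=1)

# Assumed rule: a test counts as "Fail" overall if any run reported a failure
failed = statuses.apply(lambda col: col.str.upper().eq("FAIL")).any(axis=1)
overall = failed.map({True: "Fail", False: "Pass"})

summary = times.copy()
summary.insert(0, "Test Status", overall)
summary = summary.rename_axis("Test Name").reset_index()
print(summary)
```

A different rule, for example taking the status of the latest run, only changes how overall is computed; the alignment on Test Name stays the same.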
Finally, append the result to the existing file as a new sheet:
# mode='a' appends to the existing workbook instead of overwriting it
# (supported by the openpyxl engine in pandas 0.24 and later)
with pd.ExcelWriter(path, engine='openpyxl', mode='a') as writer:
    result.to_excel(writer, sheet_name='Status', index=False)
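If the script runs after every test run, the Status sheet will already exist on the second pass. On pandas 1.3 or later, pd.ExcelWriter accepts mode="a" together with if_sheet_exists="replace", which covers that case. The following is a minimal sketch on a throwaway workbook with made-up data; the temporary path and sample frames are illustrative only.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Build a throwaway workbook with one run sheet (made-up data)
path = Path(tempfile.mkdtemp()) / "file.xlsx"
run1 = pd.DataFrame({"Test Name": ["Test 1"],
                     "Test Status": ["PASS"],
                     "Time Value": ["00:06:43"]})
run1.to_excel(path, sheet_name="Run 1", index=False)

result = pd.DataFrame({"Test Name": ["Test 1"],
                       "Test Status": ["PASS"],
                       "Run 1": ["00:06:43"]})

# mode="a" opens the existing workbook instead of overwriting it;
# if_sheet_exists="replace" swaps out an old Status sheet on later passes
for _ in range(2):
    with pd.ExcelWriter(path, engine="openpyxl", mode="a",
                        if_sheet_exists="replace") as writer:
        result.to_excel(writer, sheet_name="Status", index=False)

with pd.ExcelFile(path) as xls:
    sheet_names = xls.sheet_names
print(sheet_names)
```

The original run sheets are left untouched, and only the Status sheet is rewritten each time.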