Consolidating a result summary from different sheets into a new sheet of an existing Excel file using Python

Question

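After every run I get a new csv file with test results, and I am able to consolidate all of these files into one Excel file, with each run as a sheet name.

For this I use xlwt.

Code for others' reference, which adds the different csv files to the consolidated Excel file: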

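import csv
import os

import xlwt

# `path` is the directory that holds the per-run csv files (defined elsewhere).
book = xlwt.Workbook()
for file in os.listdir(path):
    if file.endswith('.csv'):
        # One sheet per csv file, named after the file without its extension.
        sheet = book.add_sheet(file[:-4])
        with open(os.path.join(path, file), newline='') as csv_file:
            reader = csv.reader(csv_file)
            for i, row in enumerate(reader):
                for j, each in enumerate(row):
                    sheet.write(i, j, each)

book.save("consolidate_result.xls")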

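Now I have a scenario where I have to provide a summary of the different test runs in a new summary sheet in Excel.

This is my sample Excel file. It contains multiple sheets with the following data format: the first column is the test name, the second column is the test status, and the third column is the time value for that test: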

Sheet 1, named Run 1

Test Name   Test Status     Time Value
Test 1      PASS            00:06:43
Test 2      Fail            00:06:24
Test 3      PASS            00:06:10
Test 4      PASS            00:05:25
Test 5      Fail            00:05:07
Test 6      PASS            00:02:45

Sheet 2, named Run 2

Test Name   Test Status     Time Value
Test 1      PASS            00:05:43
Test 2      Fail            00:04:24
Test 3      PASS            00:05:10
Test 4      PASS            00:06:25
Test 5      PASS            00:03:07
Test 6      PASS            00:04:45

Sheet 3, named Run 3

Test Name   Test Status     Time Value
Test 1      PASS            00:06:40
Test 2      PASS            00:06:52
Test 3      PASS            00:05:50
Test 4      PASS            00:05:35
Test 5      PASS            00:06:17
Test 6      PASS            00:03:55

What I want to achieve is a new sheet in the existing Excel file, named for example Status or Consolidation Results, in this format:

Test Name   Test-Status        Run 1        Run 2       Run 3
Test 1      Pass               00:06:43     00:05:38    00:06:43
Test 2      Fail               00:06:24    00:05:56     00:06:24
Test 3      Pass               00:06:10    00:06:43     00:06:10
Test 4      Pass               00:05:25    00:05:32     00:05:25
Test 5      Fail               00:05:07    00:05:22     00:05:07
Test 6      Pass               00:02:45    00:07:26     00:02:45

I tried reading the Excel file with pd.ExcelFile(filename), then iterating over the sheets and appending each sheet's data to a result list:

import pandas as pd

xls = pd.ExcelFile(fname)
result = []
# Read every sheet into its own DataFrame and collect them in a list.
for sheet_name in xls.sheet_names:
    dfx = pd.read_excel(xls, sheet_name)
    result.append(dfx)

Can someone help me consolidate the results into a new sheet? When I use writer = pd.ExcelWriter(fname, engine='openpyxl') and df.to_excel(writer, sheet_name='Summary'), it overwrites the Excel file and adds a blank sheet named Summary. Thanks in advance.

Answer 1

I suggest using the sheet_name=None parameter to create an ordered dictionary of DataFrames, one for each sheet:

import pandas as pd

path = "file.xlsx"

df = pd.read_excel(path, sheet_name=None)
print(df)
OrderedDict([('Run 1',   Test Name Test Status Time Value
0    Test 1        PASS   00:06:43
1    Test 2        Fail   00:06:24
2    Test 3        PASS   00:06:10
3    Test 4        PASS   00:05:25
4    Test 5        Fail   00:05:07
5    Test 6        PASS   00:02:45), ('Run 2',   Test Name Test Status Time Value
0    Test 1        PASS   00:05:43
1    Test 2        Fail   00:04:24
2    Test 3        PASS   00:05:10
3    Test 4        PASS   00:06:25
4    Test 5        PASS   00:03:07
5    Test 6        PASS   00:04:45), ('Run 3',   Test Name Test Status Time Value
0    Test 1        PASS   00:06:40
1    Test 2        PASS   00:06:52
2    Test 3        PASS   00:05:50
3    Test 4        PASS   00:05:35
4    Test 5        PASS   00:06:17
5    Test 6        PASS   00:03:55)])

Then loop over the dictionary and concat the pieces together, aligned on the Test Name and Test Status columns, which is why set_index is needed. NaN is added for values that do not match:

# Index each sheet by test name and status so concat can align the rows.
d = {k: v.set_index(['Test Name', 'Test Status'])['Time Value'] for k, v in df.items()}
result = pd.concat(d, axis=1).reset_index()
print(result)
  Test Name Test Status     Run 1     Run 2     Run 3
0    Test 1        PASS  00:06:43  00:05:43  00:06:40
1    Test 2        Fail  00:06:24  00:04:24       NaN
2    Test 2        PASS       NaN       NaN  00:06:52
3    Test 3        PASS  00:06:10  00:05:10  00:05:50
4    Test 4        PASS  00:05:25  00:06:25  00:05:35
5    Test 5        Fail  00:05:07       NaN       NaN
6    Test 5        PASS       NaN  00:03:07  00:06:17
7    Test 6        PASS  00:02:45  00:04:45  00:03:55
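
Test 2 and Test 5 take up two rows each in the output above, because their status differs between runs. If one row per test is preferred, as in the layout requested in the question, one possible variation is to align on Test Name only and derive a single overall status. The sketch below is an assumption, not part of the original answer: the rule "Fail if any run failed, otherwise Pass" and the helper name summarize_runs are hypothetical, and the sample frames reproduce only the first two tests of Run 1 and Run 3 from the question.

```python
import pandas as pd


def summarize_runs(sheets):
    """Build one row per test from a dict of {run name: DataFrame}.

    Hypothetical helper: the overall status is 'Fail' if any run failed.
    """
    # One column of time values per run, aligned on Test Name only.
    times = pd.concat(
        {run: df.set_index('Test Name')['Time Value'] for run, df in sheets.items()},
        axis=1,
    )
    # One column of statuses per run, aligned the same way.
    statuses = pd.concat(
        {run: df.set_index('Test Name')['Test Status'] for run, df in sheets.items()},
        axis=1,
    )
    # Assumed rule: a test passes overall only if it passed in every run.
    all_passed = statuses.apply(lambda col: col.str.upper() == 'PASS').all(axis=1)
    overall = all_passed.map({True: 'Pass', False: 'Fail'})
    times.insert(0, 'Test Status', overall)
    return times.rename_axis('Test Name').reset_index()


# Sample data: the first two tests of Run 1 and Run 3 from the question.
sheets = {
    'Run 1': pd.DataFrame({'Test Name': ['Test 1', 'Test 2'],
                           'Test Status': ['PASS', 'Fail'],
                           'Time Value': ['00:06:43', '00:06:24']}),
    'Run 3': pd.DataFrame({'Test Name': ['Test 1', 'Test 2'],
                           'Test Status': ['PASS', 'PASS'],
                           'Time Value': ['00:06:40', '00:06:52']}),
}
summary = summarize_runs(sheets)
print(summary)
```

With the real file, the dictionary returned by pd.read_excel(path, sheet_name=None) could be passed in place of the sample sheets. Whether a test that failed in only one run should count as failed overall is a judgment call, so the rule may need adjusting.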

Finally, append the result as a new sheet to the existing file:

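from openpyxl import load_workbook

# Load the existing workbook so that its sheets are kept.
book = load_workbook(path)
writer = pd.ExcelWriter(path, engine='openpyxl')
writer.book = book

# Add the consolidated frame as a new sheet named Status.
result.to_excel(writer, sheet_name='Status', index=False)

# Write the workbook, including the new sheet, back to disk.
writer.save()
writer.close()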
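
The approach above relies on assigning to writer.book and on calling writer.save(), both of which were deprecated around pandas 1.5 and no longer work in pandas 2.x. A minimal sketch of an alternative follows, assuming a pandas version that supports mode='a' together with if_sheet_exists (pandas 1.3 or later) and an installed openpyxl. The function name append_sheet and the file name demo.xlsx are hypothetical.

```python
import os
import tempfile

import pandas as pd


def append_sheet(path, frame, sheet_name):
    """Add `frame` as a sheet to an existing .xlsx file, keeping other sheets."""
    # mode='a' opens the existing workbook instead of overwriting it;
    # if_sheet_exists='replace' lets a rerun replace an older summary sheet.
    with pd.ExcelWriter(path, engine='openpyxl', mode='a',
                        if_sheet_exists='replace') as writer:
        frame.to_excel(writer, sheet_name=sheet_name, index=False)


# Usage sketch with a throwaway workbook in a temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.xlsx')
pd.DataFrame({'Test Name': ['Test 1'], 'Time Value': ['00:06:43']}).to_excel(
    demo_path, sheet_name='Run 1', index=False)
append_sheet(demo_path, pd.DataFrame({'Test Name': ['Test 1']}), 'Status')

with pd.ExcelFile(demo_path) as xls:
    sheet_names = xls.sheet_names
print(sheet_names)
```

With the answer's variables, the call would be append_sheet(path, result, 'Status').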

python

excel

pandas

openpyxl