Generate benchmark table

I have generated benchmarks that compare two methods of scaling video files with the ffmpeg tool.

The benchmarks are logged in this format:

x.mp4 Output_Resolution : 10 p

Parameter1 : a

Method : A

real    0m5.788s
user    0m16.112s
sys     0m0.313s

Method : B, ParameterB1 : b11

ParameterB2 : b21

real    0m6.637s
user    0m16.618s
sys     0m0.720s

ParameterB2 : b22

real    0m5.486s
user    0m17.570s
sys     0m0.568s

ParameterB2 : b23

real    0m5.232s
user    0m18.212s
sys     0m0.718s

Method : B, ParameterB1 : b12

ParameterB2 : b21

real    0m6.398s
user    0m16.790s
sys     0m0.528s

ParameterB2 : b22

real    0m5.449s
user    0m17.229s
sys     0m0.533s

ParameterB2 : b23

real    0m5.275s
user    0m18.411s
sys     0m0.522s

##################################################################################################################

Parameter1 : b

Method : A

real    0m5.927s
user    0m16.451s
sys     0m0.308s

Method : B, ParameterB1 : b11

ParameterB2 : b21

real    0m6.685s
user    0m17.044s
sys     0m0.597s

ParameterB2 : b22

real    0m5.942s
user    0m18.971s
sys     0m0.804s

ParameterB2 : b23

real    0m6.119s
user    0m20.869s
sys     0m0.792s

.
.
.

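There are two methods (A and B). Both share Parameter1, which takes the values a, b, c, ... Method B has two additional parameters, ParameterB1 and ParameterB2, which take the values b11, b12, b13, ... and b21, b22, b23, ... respectively. A separator line (a row of # characters) separates the measurements for different values of Parameter1.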
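For reference, the layout described above can be read line by line while remembering the most recent Parameter1, Method, ParameterB1 and ParameterB2 labels. The following is only a sketch under that assumption: `parse_log` and `TIME_RE` are hypothetical names, and it assumes every log follows the excerpt exactly, with one `real` line per measurement.

```python
import re

# Hypothetical helper names; the log layout is assumed to match the excerpt.
TIME_RE = re.compile(r"^real\s+(\d+)m([\d.]+)s$")

def parse_log(text):
    """Return {(method, ParameterB1, ParameterB2): {Parameter1: seconds}}."""
    results = {}
    p1 = method = b1 = b2 = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Parameter1"):
            p1 = line.split(":")[-1].strip()
        elif line.startswith("Method"):
            # Either "Method : A" or "Method : B, ParameterB1 : b11".
            fields = dict(
                (key.strip(), value.strip())
                for key, value in (part.split(":") for part in line.split(","))
            )
            method = fields["Method"]
            b1 = fields.get("ParameterB1")  # None for method A
            b2 = None
        elif line.startswith("ParameterB2"):
            b2 = line.split(":")[-1].strip()
        else:
            match = TIME_RE.match(line)
            if match:
                # Only the "real" line is kept; "user" and "sys" do not match.
                seconds = 60 * int(match.group(1)) + float(match.group(2))
                results.setdefault((method, b1, b2), {})[p1] = seconds
    return results
```

Method A ends up under the key `("A", None, None)`, and each method B combination under `("B", b1, b2)`, so every key corresponds to one row of the table I am after.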

I would like to view the benchmarks in tabular form.

+--------+---------------------------------------+----------------+----------------+----------------+
| Method |                                       | Parameter1 (a) | Parameter1 (b) | Parameter1 (c) |
+--------+---------------------------------------+----------------+----------------+----------------+
|    A   |                   NA                  | 4.03s          | 3.23s          | 1.4s           |
+--------+-------------------+-------------------+----------------+----------------+----------------+
|    B   | ParameterB1 (b11) | ParameterB2 (b21) | .              |                |                |
|        |                   +-------------------+----------------+----------------+----------------+
|        |                   | ParameterB2 (b22) | .              |                |                |
|        |                   +-------------------+----------------+----------------+----------------+
|        |                   | ParameterB2 (b23) | .              |                |                |
|        +-------------------+-------------------+----------------+----------------+----------------+
|        | ParameterB1 (b12) | ParameterB2 (b21) | .              |                |                |
|        |                   +-------------------+----------------+----------------+----------------+
|        |                   | ParameterB2 (b22) | .              |                |                |
|        |                   +-------------------+----------------+----------------+----------------+
|        |                   | ParameterB2 (b23) | .              |                |                |
|        +-------------------+-------------------+----------------+----------------+----------------+
|        | ParameterB1 (b13) | ParameterB2 (b21) | .              |                |                |
|        |                   +-------------------+----------------+----------------+----------------+
|        |                   | ParameterB2 (b22) | .              |                |                |
|        |                   +-------------------+----------------+----------------+----------------+
|        |                   | ParameterB2 (b23) | .              |                |                |
+--------+-------------------+-------------------+----------------+----------------+----------------+

Each cell holds the real (wall-clock) time in seconds, taken from lines such as real 0m6.119s.

How can I generate such a table using Python?


A few months ago I wrote a "not so efficient" Python script, with some help.

import pprint

def gettime(x):
    # Convert a time string such as "0m5.788s" to seconds.
    m,s = map(float,x[:-1].split('m'))
    return 60 * m + s

with open("log") as fp:
    lines = fp.read().splitlines()

idx = 0
A = {}
B = {}

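while idx < len(lines):
    if "Parameter1" in lines[idx]:
        # A new Parameter1 section starts here.
        Parameter1 = lines[idx].split()[-1]
        temp1 = {}
        idx += 2
        if "A" in lines[idx]:
            # The "real" line sits two lines below "Method : A".
            idx += 2
            # split() copes with both tab- and space-separated timings.
            A[Parameter1] = gettime(lines[idx].split()[-1])
            while idx < len(lines):
                if "B" in lines[idx]:
                    # "Method : B, ParameterB1 : b11" opens a ParameterB1 group.
                    ParameterB1 = lines[idx].split()[-1]
                    temp2 = {}
                    idx += 1
                    while idx < len(lines):
                        if "ParameterB2" in lines[idx]:
                            ParameterB2 = lines[idx].split()[-1]
                            idx += 2
                            temp2[ParameterB2] = gettime(lines[idx].split()[-1])
                        elif "#" in lines[idx] or "B" in lines[idx]:
                            break
                        idx += 1
                    temp1[ParameterB1] = temp2
                    # Store right away so the last section is kept even
                    # when the log does not end with a '#' separator.
                    B[Parameter1] = temp1
                elif "#" in lines[idx]:
                    B[Parameter1] = temp1
                    break
                else:
                    idx += 1
    else:
        idx += 1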
        
print("A")
print(A)

pp = pprint.PrettyPrinter(sort_dicts = False, depth = 4)
print("B")
pp.pprint(B)

This script parses the log and stores the measurements for each method and parameter combination in dictionaries.
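To make the layout concrete: the script fills `A[Parameter1]` and `B[Parameter1][ParameterB1][ParameterB2]`. A minimal sketch of flattening those two dictionaries into one entry per table row could look like this. `to_rows` is a hypothetical helper that is not part of the script, and the sample values are taken from the log excerpt.

```python
def to_rows(A, B):
    """Flatten A[p1] and B[p1][b1][b2] into (method, b1, b2, values) rows."""
    columns = list(A)  # Parameter1 values in order of appearance
    rows = [("A", "NA", "NA", [A[p] for p in columns])]

    # Collect ParameterB1 / ParameterB2 values in order of first appearance.
    b1_values, b2_values = [], []
    for per_p1 in B.values():
        for b1, per_b1 in per_p1.items():
            if b1 not in b1_values:
                b1_values.append(b1)
            for b2 in per_b1:
                if b2 not in b2_values:
                    b2_values.append(b2)

    for b1 in b1_values:
        for b2 in b2_values:
            # None marks a combination that is missing from the log.
            values = [B.get(p, {}).get(b1, {}).get(b2) for p in columns]
            rows.append(("B", b1, b2, values))
    return columns, rows


A = {"a": 5.788, "b": 5.927}
B = {
    "a": {"b11": {"b21": 6.637, "b22": 5.486}},
    "b": {"b11": {"b21": 6.685, "b22": 5.942}},
}
columns, rows = to_rows(A, B)
for row in rows:
    print(row)
```

Each row then carries its labels plus one value per Parameter1 column, which is the shape a table needs.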

Sample output of the script:

A
{'a': 4.03, 'b': 3.23, 'c': 1.4}
B
{'a': {'b11': {'b21': 0.0, 'b22': 0.0, 'b23': 0.0},
       'b12': {'b21': 0.0, 'b22': 0.0, 'b23': 0.0},
       'b13': {'b21': 0.0, 'b22': 0.0, 'b23': 0.0}},
 'b': {'b11': {'b21': 0.0, 'b22': 0.0, 'b23': 0.0},
       'b12': {'b21': 0.0, 'b22': 0.0, 'b23': 0.0},
       'b13': {'b21': 0.0, 'b22': 0.0, 'b23': 0.0}},
 'c': {'b11': {'b21': 0.0, 'b22': 0.0, 'b23': 0.0},
       'b12': {'b21': 0.0, 'b22': 0.0, 'b23': 0.0},
       'b13': {'b21': 0.0, 'b22': 0.0, 'b23': 0.0}}}

How can I print this in the table format shown above?

The Python script from the question can be extended to use PrettyTable.

This presents the data stored in the dictionaries in tabular form:
import pprint
import io
from prettytable import PrettyTable
# Install the PTable package, which provides the prettytable module.

def gettime(x):
    # Convert a time string such as "0m5.788s" to seconds.
    m,s = map(float,x[:-1].split('m'))
    return 60 * m + s

with open("log") as fp:
    lines = fp.read().splitlines()

idx = 0
A = {}
B = {}
Parameter1_list = []
ParameterB1_list = []
ParameterB2_list = []

while idx < len(lines):
    if "Parameter1" in lines[idx]:
        # A new Parameter1 section starts here.
        Parameter1 = lines[idx].split()[-1]
        Parameter1_list.append(Parameter1)
        temp1 = {}
        idx += 2
        if "A" in lines[idx]:
            # The "real" line sits two lines below "Method : A".
            idx += 2
            # split() copes with both tab- and space-separated timings.
            A[Parameter1] = gettime(lines[idx].split()[-1])
            while idx < len(lines):
                if "B" in lines[idx]:
                    # "Method : B, ParameterB1 : b11" opens a ParameterB1 group.
                    ParameterB1 = lines[idx].split()[-1]
                    ParameterB1_list.append(ParameterB1)
                    temp2 = {}
                    idx += 1
                    while idx < len(lines):
                        if "ParameterB2" in lines[idx]:
                            ParameterB2 = lines[idx].split()[-1]
                            ParameterB2_list.append(ParameterB2)
                            idx += 2
                            temp2[ParameterB2] = gettime(lines[idx].split()[-1])
                        elif "#" in lines[idx] or "B" in lines[idx]:
                            break
                        idx += 1
                    temp1[ParameterB1] = temp2
                    # Store right away so the last section is kept even
                    # when the log does not end with a '#' separator.
                    B[Parameter1] = temp1
                elif "#" in lines[idx]:
                    B[Parameter1] = temp1
                    break
                else:
                    idx += 1
    elif ".mp4" in lines[idx]:
        # The first line of the log (file name and resolution) is the title.
        title = lines[idx]
        idx += 1
    else:
        idx += 1

#print("A")
#print(A)

#pp = pprint.PrettyPrinter(sort_dicts=False,depth=4)
#print("B")
#pp.pprint(B)

Parameter1 = list(dict.fromkeys(Parameter1_list))
ParameterB1 = list(dict.fromkeys(ParameterB1_list))
ParameterB2 = list(dict.fromkeys(ParameterB2_list))

t1 = PrettyTable(['Method','ParameterB1','ParameterB2'])
t2 = PrettyTable(Parameter1)

t1.title = title
t2.title = "Parameter1"

t1.add_row(['A','NA','NA'])
# Build the row explicitly so the values line up with the Parameter1 columns.
t2.add_row([A[e] for e in Parameter1])

for d in ParameterB1:
    for c in ParameterB2:
        values = []
        for e in Parameter1:
            values.append(B[e][d][c])
        t1.add_row(['B',d,c])
        t2.add_row(values)

o1 = io.StringIO(t1.get_string())
o2 = io.StringIO(t2.get_string())

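# Join the two tables line by line: drop the right border of t1 so that the
# left border of t2 takes its place, then write each joined line to standard
# output and to result.txt.
with open('result.txt',"w") as f2:
    for x,y in zip(o1,o2):
        line = x.strip()[:-1] + y.strip()
        print(line)
        f2.write(line + "\n")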

This writes the table to a file (result.txt) and to standard output.
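The side-by-side join at the end of the script is easier to see in isolation. Below is a small sketch of the same idea on two hand-written tables. `stitch` is a hypothetical name, and the two inputs are assumed to have the same number of lines, which is why both tables in the script are given a title.

```python
def stitch(left, right):
    """Join two rendered tables line by line so they share one border column."""
    joined = []
    # zip() stops at the shorter input, so both tables need equal line counts.
    for x, y in zip(left.splitlines(), right.splitlines()):
        # Drop the last character of the left line ('+' or '|'); the first
        # character of the right line replaces it.
        joined.append(x.strip()[:-1] + y.strip())
    return "\n".join(joined)


left = "+---+\n| m |\n+---+\n| A |\n+---+"
right = "+---+\n| a |\n+---+\n| 1 |\n+---+"
combined = stitch(left, right)
# combined starts with the lines "+---+---+" and "| m | a |"
```

If one table had a title row and the other did not, the lines would pair up one row off, so the titles matter for more than appearance.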

Output:

+------------------------------------+-------------------+
|   x.mp4  Output Resolution : 10p   |     Parameter1    |
+--------+-------------+-------------+------+------+-----+
| Method | ParameterB1 | ParameterB2 |  a   |  b   |  c  |
+--------+-------------+-------------+------+------+-----+
|   A    |      NA     |      NA     | 4.03 | 3.23 | 1.4 |
|   B    |     b11     |     b21     | 0.0  | 0.0  | 0.0 |
|   B    |     b11     |     b22     | 0.0  | 0.0  | 0.0 |
|   B    |     b11     |     b23     | 0.0  | 0.0  | 0.0 |
|   B    |     b12     |     b21     | 0.0  | 0.0  | 0.0 |
|   B    |     b12     |     b22     | 0.0  | 0.0  | 0.0 |
|   B    |     b12     |     b23     | 0.0  | 0.0  | 0.0 |
|   B    |     b13     |     b21     | 0.0  | 0.0  | 0.0 |
|   B    |     b13     |     b22     | 0.0  | 0.0  | 0.0 |
|   B    |     b13     |     b23     | 0.0  | 0.0  | 0.0 |
+--------+-------------+-------------+------+------+-----+

This is the closest I could get to the table format described in the question.
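To get a little closer to the merged cells in the question without any extra package, repeated Method and ParameterB1 labels can simply be left blank. The following is a sketch in plain Python, not a PrettyTable feature: `render_merged` is a hypothetical name, the input is assumed to be a list of (method, ParameterB1, ParameterB2, values) rows, and the numbers come from the log excerpt in the question. It does not draw the partial borders between groups.

```python
def render_merged(columns, rows):
    """Render rows as a bordered table, blanking repeated group labels."""
    header = ["Method", "ParameterB1", "ParameterB2"]
    header += [f"Parameter1 ({c})" for c in columns]

    body = []
    prev_method = prev_b1 = None
    for method, b1, b2, values in rows:
        new_method = method != prev_method
        new_b1 = new_method or b1 != prev_b1
        # Show a label only on the first row of its group.
        cells = [method if new_method else "", b1 if new_b1 else "", b2]
        cells += [f"{v:.3f}s" for v in values]
        body.append(cells)
        prev_method, prev_b1 = method, b1

    # Column width is the widest cell in that column, header included.
    widths = [max(len(r[i]) for r in [header] + body) for i in range(len(header))]
    border = "+" + "+".join("-" * (w + 2) for w in widths) + "+"

    def fmt(cells):
        return "|" + "|".join(f" {c:<{w}} " for c, w in zip(cells, widths)) + "|"

    lines = [border, fmt(header), border]
    lines += [fmt(cells) for cells in body]
    lines.append(border)
    return "\n".join(lines)


columns = ["a", "b"]
rows = [
    ("A", "NA", "NA", [5.788, 5.927]),
    ("B", "b11", "b21", [6.637, 6.685]),
    ("B", "b11", "b22", [5.486, 5.942]),
    ("B", "b11", "b23", [5.232, 6.119]),
]
table = render_merged(columns, rows)
print(table)
```

Here "B" and "b11" appear only on the first row of their group, and the two rows below leave those cells empty.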