Timing short-circuit evaluation in Python gives unexpected results
import time as dt

success = True
can_test = True

time = 0
for i in range(10000000):
    start = dt.time()
    if success and can_test:
        stop = dt.time()
        time += stop - start
print(f'"and" operation took: {time} seconds')

time = 0
for i in range(10000000):
    start = dt.time()
    if success or can_test:
        stop = dt.time()
        time += stop - start
print(f'"or" operation took: {time} seconds')
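When I run the Python program above, I expect the and operation to be slower than the or operation (because, as I understand it, short-circuiting reduces execution time). However, the result is not only the exact opposite, it also fluctuates. I can understand the fluctuation (background processes), but why is the result reversed? What is going on?
Here is a sample result.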
"and" operation took: 5.200342893600464 seconds
"or" operation took: 5.3243467807769775 seconds
This is an interesting question, so I decided to dig into your main concern.
# required modules: line_profiler, matplotlib, seaborn and scipy
import time as dt
from line_profiler import LineProfiler
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats

success = True
can_test = True

def and_op():
    for x in range(2000):
        s = success and can_test

def or_op():
    for x in range(2000):
        s = success or can_test

# Profile or_op 1000 times; keep the share of time spent on its last line.
or_op_list = []
for x in range(0, 1000):
    lp = LineProfiler()
    lp_wrapper = lp(or_op)
    lp_wrapper()
    lstats = lp.get_stats()
    total_time = 0
    for v in lstats.timings.values():
        for op in v:
            total_time += op[-1]
            final = op[-1]
        operator = final / total_time
    or_op_list.append(operator)

# Same measurement for and_op.
and_op_list = []
for x in range(0, 1000):
    lp = LineProfiler()
    lp_wrapper = lp(and_op)
    lp_wrapper()
    lstats = lp.get_stats()
    total_time = 0
    for v in lstats.timings.values():
        for op in v:
            total_time += op[-1]
            final = op[-1]
        operator = final / total_time
    and_op_list.append(operator)

sns.kdeplot(and_op_list, label='AND')
sns.kdeplot(or_op_list, label='OR')
plt.show()
print(stats.ttest_ind(and_op_list, or_op_list, equal_var=False))
p-value = 1.8293386245013954e-103
Indeed, the "or" operation differs from the "and" operation, and the difference is statistically significant.
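With equal_var=False, scipy's ttest_ind performs Welch's t-test, which does not assume the two samples share a variance. As a rough illustration of what that statistic is, here is a minimal standard-library sketch; welch_t is a hypothetical helper name, the two short lists are made-up inputs rather than profiler data, and the sketch stops at the t statistic and degrees of freedom without computing a p-value.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Variance of each sample mean (statistics.variance divides by n - 1).
    var_a = statistics.variance(sample_a) / n_a
    var_b = statistics.variance(sample_b) / n_b
    mean_gap = statistics.mean(sample_a) - statistics.mean(sample_b)
    t = mean_gap / math.sqrt(var_a + var_b)
    dof = (var_a + var_b) ** 2 / (
        var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1)
    )
    return t, dof


# Made-up inputs, only to show the call shape.
t, dof = welch_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(f"t = {t}, degrees of freedom = {dof}")
```

A large absolute t with many samples is what drives a p-value as small as the one reported above.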
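When I run your code on my machine, it sometimes reports True and True as faster than True or True as well.
The reason is that dt.time() in your code works on a microsecond scale (that is, in steps of 1000 nanoseconds), and that scale is too coarse to measure a single execution of if success and can_test: or if success or can_test:. In most cases, either statement takes less than 1 microsecond.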
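How coarse a clock is depends on the platform, so it is worth checking on your own machine. The sketch below asks Python for the resolution each clock reports through time.get_clock_info and also spins on the nanosecond readers to find the smallest step that actually shows up. The helper names describe_clock and smallest_observed_step_ns are made up for this illustration, and the numbers printed will differ between systems.

```python
import time


def describe_clock(name):
    """Return (implementation, reported resolution in seconds) for a clock."""
    info = time.get_clock_info(name)
    return info.implementation, info.resolution


def smallest_observed_step_ns(read_ns, attempts=200_000):
    """Poll a clock and return the smallest forward step seen, in ns.

    Returns None if the reading never advanced within the given attempts.
    """
    smallest = None
    previous = read_ns()
    for _ in range(attempts):
        current = read_ns()
        if current > previous:
            step = current - previous
            if smallest is None or step < smallest:
                smallest = step
        previous = current
    return smallest


for name, read_ns in (("time", time.time_ns),
                      ("perf_counter", time.perf_counter_ns)):
    implementation, resolution = describe_clock(name)
    step = smallest_observed_step_ns(read_ns)
    print(f"{name}: {implementation}, reported resolution {resolution} s, "
          f"smallest observed step {step} ns")
```

If the smallest observed step for time is around 1000 ns while the timed statement is far shorter, most individual readings can only come out as zero or one step.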
So, in the following part of your code:
for i in range(10000000):
    start = dt.time()
    if success and can_test:  # a dust particle
        stop = dt.time()
        time += stop - start  # measured by an ordinary ruler
for i in range(10000000):
    start = dt.time()
    if success or can_test:  # a dust particle
        stop = dt.time()
        time += stop - start  # measured by an ordinary ruler
What your code does is like measuring each dust particle with an ordinary ruler and adding up the measurements. Because the measurement error is large, the result is distorted.
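The ruler analogy can be made concrete with a small simulation that uses no real clock at all. It assumes a clock that only advances in 1000 ns ticks and a statement whose true cost is much shorter; TICK_NS, OP_NS and GAP_NS are made-up values chosen for illustration, not measurements.

```python
TICK_NS = 1000  # assumed clock granularity: 1 microsecond
OP_NS = 140     # hypothetical true cost of the timed statement
GAP_NS = 257    # hypothetical loop overhead between two measurements


def coarse(t_ns, tick_ns=TICK_NS):
    """Round a true timestamp down to the last tick the clock can show."""
    return (t_ns // tick_ns) * tick_ns


def simulate(n, op_ns=OP_NS, gap_ns=GAP_NS):
    """Time n simulated operations one by one with the coarse clock.

    Returns a dict mapping each measured duration to how often it occurred.
    """
    now = 0  # true time in ns, invisible to the coarse clock
    readings = {}
    for _ in range(n):
        start = coarse(now)
        now += op_ns  # the operation itself
        stop = coarse(now)
        diff = stop - start
        readings[diff] = readings.get(diff, 0) + 1
        now += gap_ns  # rest of the loop body
    return readings


readings = simulate(100_000)
print(readings)
```

Every simulated reading is either 0 or one full tick, never the true 140 ns: a single measurement says almost nothing about the statement, and only the proportion of one-tick readings carries information.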
To investigate further, suppose we run the code below (d records each measured duration and how often it occurred):
import time as dt
from pprint import pprint

success = True
can_test = True

time = 0
d = {}
for i in range(10000000):
    start = dt.time_ns()
    if success and can_test:  # a dust particle
        stop = dt.time_ns()
        diff_time = stop - start  # measurement by an ordinary ruler
        d[diff_time] = d.get(diff_time, 0) + 1
        time += diff_time
print(f'"and" operation took: {time} ns')
print('"and" operation time distribution:')
pprint(d)
print()

time = 0
d = {}
for i in range(10000000):
    start = dt.time_ns()
    if success or can_test:  # a dust particle
        stop = dt.time_ns()
        diff_time = stop - start  # measurement by an ordinary ruler
        d[diff_time] = d.get(diff_time, 0) + 1
        time += diff_time
print(f'"or" operation took: {time} ns')
print('"or" operation time distribution:')
pprint(d)
It prints the following:
"and" operation took: 1467442000 ns
"and" operation time distribution:
{0: 8565832,
1000: 1432066,
2000: 136,
3000: 24,
4000: 12,
5000: 15,
6000: 10,
7000: 12,
8000: 6,
9000: 7,
10000: 6,
11000: 3,
12000: 191,
13000: 722,
14000: 170,
15000: 462,
16000: 23,
17000: 30,
18000: 27,
19000: 10,
20000: 12,
21000: 11,
22000: 61,
23000: 65,
24000: 9,
25000: 2,
26000: 2,
27000: 3,
28000: 1,
29000: 4,
30000: 4,
31000: 2,
32000: 2,
33000: 2,
34000: 3,
35000: 3,
36000: 5,
37000: 4,
40000: 2,
41000: 1,
42000: 2,
43000: 2,
44000: 2,
48000: 2,
50000: 3,
51000: 3,
52000: 1,
53000: 3,
54000: 1,
55000: 4,
58000: 1,
59000: 2,
61000: 1,
62000: 4,
63000: 1,
84000: 1,
98000: 1,
1035000: 1,
1043000: 1,
1608000: 1,
1642000: 1}
"or" operation took: 1455555000 ns
"or" operation time distribution:
{0: 8569860,
1000: 1428228,
2000: 131,
3000: 31,
4000: 22,
5000: 8,
6000: 8,
7000: 6,
8000: 3,
9000: 6,
10000: 3,
11000: 4,
12000: 173,
13000: 623,
14000: 174,
15000: 446,
16000: 28,
17000: 22,
18000: 31,
19000: 9,
20000: 11,
21000: 8,
22000: 42,
23000: 72,
24000: 7,
25000: 3,
26000: 1,
27000: 5,
28000: 2,
29000: 2,
31000: 1,
33000: 1,
34000: 2,
35000: 4,
36000: 1,
37000: 1,
38000: 2,
41000: 1,
44000: 1,
45000: 2,
46000: 2,
47000: 2,
48000: 2,
49000: 1,
50000: 1,
51000: 2,
53000: 1,
61000: 1,
64000: 1,
65000: 1,
942000: 1}
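We can see that about 85.7% of the attempted measurements (8565832 / 10000000 equals 0.8565832, and 8569860 / 10000000 equals 0.8569860) failed, because they measured just 0 nanoseconds. About 14.3% of the attempts (1432066 / 10000000 equals 0.1432066, and 1428228 / 10000000 equals 0.1428228) measured 1000 nanoseconds. And, needless to say, the remaining attempts (less than 0.1%) also came out as multiples of 1000 nanoseconds. The microsecond scale is clearly too coarse to measure the time taken by each execution.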
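The same figures also show where the difference between the two totals comes from. The sketch below uses only numbers copied from the output above and splits each total into the part contributed by 1000 ns readings and the part contributed by everything larger; breakdown is a hypothetical helper name.

```python
TOTAL_RUNS = 10_000_000
TICK_NS = 1000


def breakdown(total_ns, zero_count, one_tick_count):
    """Split a reported total into its one-tick part and everything larger."""
    one_tick_ns = one_tick_count * TICK_NS
    return {
        "zero_share": zero_count / TOTAL_RUNS,
        "one_tick_share": one_tick_count / TOTAL_RUNS,
        "one_tick_ns": one_tick_ns,
        # Zero readings add nothing, so the rest is readings of 2000 ns or more.
        "outlier_ns": total_ns - one_tick_ns,
    }


# Totals and counts copied from the printed output above.
and_parts = breakdown(1_467_442_000, 8_565_832, 1_432_066)
or_parts = breakdown(1_455_555_000, 8_569_860, 1_428_228)

total_gap = 1_467_442_000 - 1_455_555_000
one_tick_gap = and_parts["one_tick_ns"] - or_parts["one_tick_ns"]
outlier_gap = and_parts["outlier_ns"] - or_parts["outlier_ns"]
print(f"gap between totals:      {total_gap} ns")
print(f"  from 1000 ns readings: {one_tick_gap} ns")
print(f"  from larger readings:  {outlier_gap} ns")
```

In this particular run, about two thirds of the gap between the "and" and "or" totals comes from the rare readings of 2000 ns or more, which are more plausibly interruptions than the cost of the statement itself. That is one more reason the per-iteration totals can swing either way.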
But we can still use the ordinary ruler, by gathering the dust particles together and measuring the resulting dustball. So we can try the code below:
import time as dt

success = True
can_test = True

start = dt.time()
for i in range(10000000):  # gather the dust particles
    if success and can_test:  # a dust particle
        pass
stop = dt.time()
time = stop - start  # measure the size of the dustball
print(f'"and" operation took: {time} seconds')

start = dt.time()
for i in range(10000000):  # gather the dust particles
    if success or can_test:  # a dust particle
        pass
stop = dt.time()
time = stop - start  # measure the size of the dustball
print(f'"or" operation took: {time} seconds')
It prints the following:
"and" operation took: 0.6261420249938965 seconds
"or" operation took: 0.48876094818115234 seconds
Alternatively, we can use a fine-scale ruler, dt.perf_counter(), which can measure the size of each dust particle precisely, as shown below:
import time as dt

success = True
can_test = True

time = 0
for i in range(10000000):
    start = dt.perf_counter()
    if success and can_test:  # a dust particle
        stop = dt.perf_counter()
        time += stop - start  # measured by a fine-scale ruler
print(f'"and" operation took: {time} seconds')

time = 0
for i in range(10000000):
    start = dt.perf_counter()
    if success or can_test:  # a dust particle
        stop = dt.perf_counter()
        time += stop - start  # measured by a fine-scale ruler
print(f'"or" operation took: {time} seconds')
It prints the following:
"and" operation took: 1.6929048989996773 seconds
"or" operation took: 1.3965214280016083 seconds
Sure enough, True or True is faster than True and True!
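That ordering is what short-circuit evaluation predicts when both operands are true: and has to evaluate the second operand to know the result, while or can stop after the first. The sketch below makes this visible by recording which operands actually get evaluated; operand and calls are names invented for the illustration.

```python
calls = []


def operand(name, value):
    """Record that this operand was evaluated, then return its value."""
    calls.append(name)
    return value


calls.clear()
result_and = operand("success", True) and operand("can_test", True)
and_calls = list(calls)

calls.clear()
result_or = operand("success", True) or operand("can_test", True)
or_calls = list(calls)

print(f"and evaluated: {and_calls}")  # and evaluated: ['success', 'can_test']
print(f"or evaluated:  {or_calls}")   # or evaluated:  ['success']
```

With a false first operand the roles swap: and stops early and or has to continue, so which operator is cheaper depends on the values involved.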