Track how long a thread spent actually working on a future

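I want to track how long a thread in a ThreadPoolExecutor spends executing a piece of code I submitted to the pool, not how long the work item spent in the pool. I call future.result() to get the result, of course, but I wish there were something like future.time() that returned the execution time. Any suggestions?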

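One possible approach is to use a shared structure that captures execution-time statistics for each thread.
Consider the following example (computing the factorials of 10 consecutive numbers in parallel):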

from concurrent.futures import ThreadPoolExecutor
from threading import current_thread
from functools import partial
import time
import random
import math
import pprint

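def fact(time_dict, num):
    # Start the clock inside the worker, so time spent waiting in the queue is not counted.
    t0 = time.perf_counter()
    res = math.factorial(num)
    time.sleep(random.randint(1, 5))  # simulate a slow task

    # Keyed by thread name: a thread that runs several tasks keeps only its last timing.
    time_dict[current_thread().name] = time.perf_counter() - t0
    return res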

time_dict = {}
with ThreadPoolExecutor(max_workers=10, thread_name_prefix='thread') as executor:
    factorials = executor.map(partial(fact, time_dict), range(1, 11))
    pprint.pprint(['result: ', list(factorials)])
    pprint.pprint(['timings:', time_dict])

Sample output:

['result: ', [1, 2, 6, 24, 120, 720, 5040, 40320, 362880, 3628800]]
['timings:',
 {'thread_0': 2.005145788192749,
  'thread_1': 2.004167079925537,
  'thread_2': 5.0020458698272705,
  'thread_3': 4.004181146621704,
  'thread_4': 3.0027127265930176,
  'thread_5': 5.001489877700806,
  'thread_6': 3.002448797225952,
  'thread_7': 5.001359224319458,
  'thread_8': 2.005021095275879,
  'thread_9': 2.0049009323120117}]
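
The example above keys the timings by thread name, which works because there are as many workers as tasks. When the pool is smaller than the number of tasks, one way to keep a timing per task is to key the shared dict by the task input instead. The following is a minimal sketch under that assumption; the names timings, timings_lock and timed_fact are made up for illustration, and the short sleep only stands in for real work.

```python
from concurrent.futures import ThreadPoolExecutor
import math
import threading
import time

timings = {}                      # task input -> seconds spent inside the worker
timings_lock = threading.Lock()   # guards the shared dict (hypothetical helper)

def timed_fact(num):
    start = time.perf_counter()   # clock starts when a worker picks the task up
    res = math.factorial(num)
    time.sleep(0.01)              # stand-in for real work
    elapsed = time.perf_counter() - start
    with timings_lock:
        timings[num] = elapsed    # one entry per task, even when threads are reused
    return res

# Only 3 workers for 10 tasks, so every thread runs several tasks.
with ThreadPoolExecutor(max_workers=3) as executor:
    results = list(executor.map(timed_fact, range(1, 11)))
```

After the with block exits, timings holds one entry for each of the ten inputs, regardless of which thread ran it.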

I would write a simple wrapper to do this:

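import time
from functools import wraps

def timed(func):
    @wraps(func)
    def _w(*a, **k):
        then = time.perf_counter()  # start timing inside the worker thread
        res = func(*a, **k)
        elapsed = time.perf_counter() - then
        return elapsed, res  # execution time first, then the original result
    return _w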

Then you submit the wrapped function to the executor, for example with executor.map(timed(is_prime), PRIMES) (an example taken from the documentation).

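Of course, you then need to unpack the elapsed time and the result. With executor.submit that looks like this (executor.map yields the same (elapsed, result) tuples directly):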

elapsed, result = future.result()
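
To show the wrapper end to end, here is a minimal self-contained sketch that uses executor.submit and as_completed. The function slow_square and the dict collected are hypothetical, chosen only for illustration, and the wrapper is repeated so that the snippet runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from functools import wraps
import time

def timed(func):
    """Wrap func so that it returns (elapsed_seconds, result)."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        return time.perf_counter() - start, result
    return wrapper

def slow_square(x):               # hypothetical workload
    time.sleep(0.01)
    return x * x

collected = {}                    # input -> (elapsed, result)
with ThreadPoolExecutor(max_workers=2) as executor:
    # Map each future back to the input it was submitted with.
    futures = {executor.submit(timed(slow_square), n): n for n in range(5)}
    for future in as_completed(futures):
        elapsed, result = future.result()
        collected[futures[future]] = (elapsed, result)
```

Because the clock starts and stops inside the wrapper, elapsed covers only the time the worker thread spent running the function, not the time the task waited in the pool's queue.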