Python numpy.fft very slow (10x slower) when run in subprocess

I've found that numpy.fft.fft (and its variants) is very slow when run in a background process. Here is an example of what I mean:

import numpy as np
import multiprocessing as mproc
import time
import sys

# the producer function, which will run in the background and produce data
def Producer(dataQ):
    numFrames = 5
    n = 0
    while n < numFrames:
        data = np.random.rand(3000, 200)
        dataQ.put(data)   # send the data to the consumer
        time.sleep(0.1)   # sleep for 0.1 second, so we don't overload the CPU
        n += 1

# the consumer function, which will run in the background and consume data from the producer
def Consumer(dataQ):
    while True:
        data = dataQ.get()
        t1 = time.time()
        fftdata = np.fft.rfft(data, n=3000*5)
        tDiff = time.time() - t1
        print("Elapsed time is %0.3f" % tDiff)
        time.sleep(0.01)
        sys.stdout.flush()

# the main program; if __name__ == '__main__': is necessary so that this code is run
# only when this program is started by the user
if __name__ == '__main__':
    data = np.random.rand(3000, 200)
    t1 = time.time()
    fftdata = np.fft.rfft(data, n=3000*5, axis=0)
    tDiff = time.time() - t1
    print("Elapsed time is %0.3f" % tDiff)

    # generate a queue for transferring data between the producer and the consumer
    dataQ = mproc.Queue(4)

    # start up the processes
    producerProcess = mproc.Process(target=Producer, args=[dataQ], daemon=False)
    consumerProcess = mproc.Process(target=Consumer, args=[dataQ], daemon=False)
    print("starting up processes")

    producerProcess.start()
    consumerProcess.start()
    time.sleep(10) # let the program run for 10 seconds

    producerProcess.terminate()
    consumerProcess.terminate()

The output it produces on my machine:

Elapsed time is 0.079
starting up processes
Elapsed time is 0.859
Elapsed time is 0.861
Elapsed time is 0.878
Elapsed time is 0.863
Elapsed time is 0.758

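As you can see, the FFT is roughly 10x slower when run in the background, and I can't figure out why. The time.sleep() calls should ensure that the other processes (the main process and the producer process) aren't doing anything while the FFT is being computed, so it should have all cores available. I've checked CPU utilization in Windows Task Manager, and it seems to use about 25% when numpy.fft.fft is called heavily, in both the single-process and the multi-process case.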

Does anyone have an idea what's going on?

The main problem is that your fft call in the background process is:

fftdata = np.fft.rfft(data, n=3000*5)

instead of:

fftdata = np.fft.rfft(data, n=3000*5, axis=0)

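For me, that made all the difference. Without axis=0, rfft transforms along the last axis, so it computes 3000 zero-padded transforms of length 15000 instead of 200, which is 15 times as much work.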
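To see the effect of the axis argument without waiting on the full-size arrays, here is a small sketch. The array sizes are scaled down by a factor of 10 from the question (an assumption made only to keep memory use low), and the variable names are made up for illustration. The output shapes show which axis was padded and transformed.

```python
import numpy as np

# Scaled-down stand-in for the (3000, 200) array from the question.
data = np.random.rand(300, 20)
n = 300 * 5

# Default axis=-1: each of the 300 rows (length 20) is zero-padded to n,
# so 300 transforms of length 1500 are computed.
by_rows = np.fft.rfft(data, n=n)

# axis=0: each of the 20 columns (length 300) is zero-padded to n,
# so only 20 transforms of length 1500 are computed.
by_cols = np.fft.rfft(data, n=n, axis=0)

print(by_rows.shape)  # (300, 751)
print(by_cols.shape)  # (751, 20)
```

The transformed axis ends up with n // 2 + 1 entries, so the first call does 15 times as many transforms as the second, which is in line with the slowdown reported in the question.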

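A few other things are worth noting. Rather than having time.sleep() everywhere, why not let the processor handle that itself? Also, rather than suspending the main thread, you can use

consumerProcess.join()

and then have the producer process run dataQ.put(None) once it has finished loading the data, and break out of the loop in the consumer process, i.e.: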

def Consumer(dataQ):
    while True:
        data = dataQ.get()
        if data is None:
            break
        ...
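Putting these suggestions together, a minimal sketch of the whole program might look like the following. It assumes the FFT call is the corrected one with axis=0. The function names, the num_frames parameter, and the returned frame count are my own additions for illustration and are not from the question.

```python
import multiprocessing as mproc
import time

import numpy as np


def producer(data_q, num_frames=5):
    """Put num_frames random arrays on the queue, then signal completion."""
    for _ in range(num_frames):
        data_q.put(np.random.rand(3000, 200))
    data_q.put(None)  # sentinel: tells the consumer there is no more data


def consumer(data_q):
    """Transform frames until the sentinel arrives; return the frame count."""
    count = 0
    while True:
        data = data_q.get()  # blocks until data is available, no sleep needed
        if data is None:
            break
        t1 = time.perf_counter()
        np.fft.rfft(data, n=3000 * 5, axis=0)
        print("Elapsed time is %0.3f" % (time.perf_counter() - t1), flush=True)
        count += 1
    return count


def main():
    data_q = mproc.Queue(4)
    producer_process = mproc.Process(target=producer, args=(data_q,))
    consumer_process = mproc.Process(target=consumer, args=(data_q,))
    print("starting up processes")
    producer_process.start()
    consumer_process.start()
    # Wait for the work to finish instead of sleeping and terminating.
    consumer_process.join()
    producer_process.join()


if __name__ == "__main__":
    main()
```

Because dataQ.get() blocks until an item arrives, neither process needs to sleep, and the program exits as soon as the last frame has been processed rather than after a fixed delay.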